Re: [VOTE] Apache Apex Malhar Release 3.5.0 (RC1)

2016-08-29 Thread Thomas Weise
Hi Justin,

Thanks for the thorough inspection. Please see comments below.

David, Siyuan, there are questions for you as well.

Thanks,
Thomas


On Mon, Aug 29, 2016 at 7:16 PM, Justin Mclean 
wrote:

> Hi,
>
> Sorry but -1 until license/copyright issues resolved.
>
> I checked:
> - signature and hashes good
> - LICENSE and NOTICE good
> - release contains content not compatible with Apache license (see below)
> - no unexpected binary files
> - all source files have ASF headers
>
> I’m fairly sure that bundling this [1] is not allowed. Project Gutenberg
> is generally considered public domain, but it is not listed as an
> Apache-compatible license here [3] and it is not an OSI-approved license
> [4]. It may be best to ask on legal-discuss for clarification.
>
> You may also want to replace the content of [2] with something else, as it
> is copyrighted by Time magazine. (I thought this had been raised before?)
> These files [5][6][7] also look to have similar copyright issues.
>
> Thanks,
> Justin
>
> 1. apache-apex-malhar-3.5.0/library/src/test/resources/wordcount.txt
>

Author: David Yan 
Date:   Thu Jun 23 14:37:29 2016 -0700

David, did you find something that indicates that we can use this?


> 2. apache-apex-malhar-3.5.0/demos/wordcount/src/main/
> resources/samplefile.txt
>

A file with the same content was removed in 3.2.0 after you brought it up
back then. It turns out that a copy also existed in another location. Ouch.
I will submit a change to remove the content.


> 3. http://www.apache.org/legal/resolved.html#category-a
> 4. https://opensource.org/licenses/alphabetical
> 5. apache-apex-malhar-3.5.0/demos/highlevelapi/src/test/
> resources/sampletweets.txt
> 6. apache-apex-malhar-3.5.0/stream/src/test/resources/sampletweets.txt
>

commit 266b04116760dbd4d5cad6b4102b06153ac96a5f
Author: Siyuan Hua 
Date:   Tue Jul 12 11:57:09 2016 -0700


> 7. apache-apex-malhar-3.5.0/contrib/src/test/resources/
> com/datatorrent/contrib/romesyndication/datatorrent_feed_updated.rss
> 8. apache-apex-malhar-3.5.0/contrib/src/test/resources/
> com/datatorrent/contrib/romesyndication/datatorrent_feed.rss


These files (7./8.) were already replaced when the issue first came up with
3.2.0. Here is the diff:

https://github.com/apache/apex-malhar/commit/75dc74fe33740cd6d2e3432a4248d95333412648

The content is from the DT blog site. Do you see an issue with it?


[jira] [Comment Edited] (APEXMALHAR-2130) implement scalable windowed storage

2016-08-29 Thread Chandni Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447840#comment-15447840
 ] 

Chandni Singh edited comment on APEXMALHAR-2130 at 8/30/16 3:04 AM:


Note: The main change in ManagedState which is required here is that 
timeBuckets (Window time in your example) is now computed outside ManagedState. 
TimeBuckets were being computed by TimeBucketAssigner within ManagedState but 
now it will be provided to it.


>>>
Since event time is arbitrary, unlike processing time, the actual key 
representing the timebucket cannot be assumed to be a natural sequence. However, 
TimeBucketAssigner.getTimeBucketAndAdjustBoundaries seems to return a long that 
is sequential starting from 0. We want to make the actual timebucket key based 
on the actual event window timestamp. Chandni Singh, will this break anything?

Answer: No, it will not break anything. The time here is event time, and this 
does NOT assume that events are received in order. Based on event time, this 
method creates the timebucket. In your use case, the timebucket is computed 
outside ManagedState, so there are two ways to approach it:
 - create a special TimeBucketAssigner that just returns the input Window 
for the event. It will not compute the timebucket any further.
 - make TimeBucketAssigner an optional property in AbstractManagedStateImpl. If 
it is null, then the time argument is used as the timebucket saved in the Bucket.
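
The two approaches listed above could be sketched roughly as follows. This is an illustration only: BucketAssigner, PassThroughAssigner and ManagedStateSketch are hypothetical names invented for this sketch, plain in-memory maps stand in for the real buckets, and the actual TimeBucketAssigner and AbstractManagedStateImpl APIs in Malhar are different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the role TimeBucketAssigner plays; not the Malhar API.
interface BucketAssigner {
  long getTimeBucket(long time);
}

// Approach 1: a special assigner that returns the caller's window time unchanged.
class PassThroughAssigner implements BucketAssigner {
  @Override
  public long getTimeBucket(long windowTime) {
    return windowTime;
  }
}

// Approach 2: the assigner is optional; when it is null, the time argument
// itself is used as the timebucket.
class ManagedStateSketch {
  private final BucketAssigner assigner; // may be null
  private final Map<Long, Map<String, String>> buckets = new HashMap<>();

  ManagedStateSketch(BucketAssigner assigner) {
    this.assigner = assigner;
  }

  // Stores the value and returns the timebucket that was actually used.
  long put(long time, String key, String value) {
    long timeBucket = (assigner == null) ? time : assigner.getTimeBucket(time);
    buckets.computeIfAbsent(timeBucket, tb -> new HashMap<>()).put(key, value);
    return timeBucket;
  }

  String get(long timeBucket, String key) {
    Map<String, String> bucket = buckets.get(timeBucket);
    return bucket == null ? null : bucket.get(key);
  }
}
```

In both variants the caller's event window timestamp ends up as the timebucket key, which is the behavior the question asks for.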

>>>
Expiring and purging are done very differently and should be based on 3. 
Managed State should determine whether to purge a timebucket based on whether 
an Apex window is committed and whether all event windows that belong to that 
timebucket are marked "deleted" for that Apex window.

Answer: This is again handled by TimeBucketAssigner. I don't think much change 
is needed here. TimeBucketAssigner computes a timeBucket (in your case, this 
corresponds to Window time) and checks whether the oldest buckets need to be 
purged (line 132 - 133). It figures out the lowest purgeable timebucket. In 
endWindow, it informs IncrementalCheckpointManager that it can delete all 
timebuckets <= lowestPurgeableTimeBucket. However, IncrementalCheckpointManager 
deletes the data up to that timebucket only when the window in which the purge 
was requested gets committed. So this will remain the same for you as well.

I think this can also be achieved by creating a special TimeBucketAssigner and 
overriding a few methods.
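
The "request the purge in endWindow, delete only on commit" behavior described in the answer can be illustrated with a small sketch. DeferredPurgeSketch and its method names are hypothetical and only mimic the described flow with in-memory maps; this is not the IncrementalCheckpointManager implementation.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: purge requests are recorded per Apex window and
// applied only once that window has been committed.
class DeferredPurgeSketch {
  // timeBucket -> data (a String stands in for the bucket contents)
  final TreeMap<Long, String> timeBuckets = new TreeMap<>();
  // Apex window id in which the purge was requested -> lowest purgeable timebucket
  private final TreeMap<Long, Long> purgeRequests = new TreeMap<>();

  // Called at endWindow: only records the request, nothing is deleted yet.
  void endWindow(long windowId, long lowestPurgeableTimeBucket) {
    purgeRequests.put(windowId, lowestPurgeableTimeBucket);
  }

  // Called on commit: applies every request made in a window <= the committed one.
  void committed(long committedWindowId) {
    Iterator<Map.Entry<Long, Long>> it =
        purgeRequests.headMap(committedWindowId, true).entrySet().iterator();
    while (it.hasNext()) {
      long upTo = it.next().getValue();
      timeBuckets.headMap(upTo, true).clear(); // delete all timebuckets <= upTo
      it.remove();
    }
  }
}
```

Until the requesting window is committed, the data stays available, so a recovery to an earlier checkpoint still finds it.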




> implement scalable windowed storage
> ---
>
> Key: APEXMALHAR-2130
> URL: 

[jira] [Commented] (APEXMALHAR-2130) implement scalable windowed storage

2016-08-29 Thread Chandni Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447840#comment-15447840
 ] 

Chandni Singh commented on APEXMALHAR-2130:
---


> implement scalable windowed storage
> ---
>
> Key: APEXMALHAR-2130
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: David Yan
>
> This feature is used for supporting windowing.
> The storage needs to have the following features:
> 1. Spillable key value storage (integrate with APEXMALHAR-2026)
> 2. Upon checkpoint, it saves a snapshot for the entire data set with the 
> checkpointing window id.  This should be done incrementally (ManagedState) to 
> avoid wasting space with unchanged data
> 3. When recovering, it takes the recovery window id and restores to that 
> snapshot
> 4. When a window is committed, all windows with a lower ID should be purged 
> from the store.
> 5. It should implement the WindowedStorage and WindowedKeyedStorage 
> interfaces, and because of 2 and 3, we may want to add methods to the 
> WindowedStorage interface so that the implementation of WindowedOperator can 
> notify the storage of checkpointing, recovering and committing of a window.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2205) State management benchmark

2016-08-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447785#comment-15447785
 ] 

ASF GitHub Bot commented on APEXMALHAR-2205:


Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/386


> State management benchmark
> --
>
> Key: APEXMALHAR-2205
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2205
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: bright chen
>






[GitHub] apex-malhar pull request #386: APEXMALHAR-2205 State management benchmark

2016-08-29 Thread brightchen
GitHub user brightchen reopened a pull request:

https://github.com/apache/apex-malhar/pull/386

APEXMALHAR-2205 State management benchmark



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2205

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/386.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #386


commit eea3d04910e4b06983cacb98a1e0d1429d0f968d
Author: brightchen 
Date:   2016-08-26T23:09:12Z

APEXMALHAR-2205 State management benchmark




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] apex-malhar pull request #386: APEXMALHAR-2205 State management benchmark

2016-08-29 Thread brightchen
Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/386




[jira] [Commented] (APEXMALHAR-2205) State management benchmark

2016-08-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447784#comment-15447784
 ] 

ASF GitHub Bot commented on APEXMALHAR-2205:


GitHub user brightchen reopened a pull request:

https://github.com/apache/apex-malhar/pull/386

APEXMALHAR-2205 State management benchmark



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2205

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/386.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #386


commit eea3d04910e4b06983cacb98a1e0d1429d0f968d
Author: brightchen 
Date:   2016-08-26T23:09:12Z

APEXMALHAR-2205 State management benchmark






Re: [VOTE] Apache Apex Malhar Release 3.5.0 (RC1)

2016-08-29 Thread Justin Mclean
Hi,

Sorry but -1 until license/copyright issues resolved.

I checked:
- signature and hashes good
- LICENSE and NOTICE good
- release contains content not compatible with Apache license (see below)
- no unexpected binary files
- all source files have ASF headers

I’m fairly sure that bundling this [1] is not allowed. Project Gutenberg is 
generally considered public domain, but it is not listed as an Apache-compatible 
license here [3] and it is not an OSI-approved license [4]. It may be best to 
ask on legal-discuss for clarification.

You may also want to replace the content of [2] with something else, as it is 
copyrighted by Time magazine. (I thought this had been raised before?) These 
files [5][6][7] also look to have similar copyright issues.

Thanks,
Justin

1. apache-apex-malhar-3.5.0/library/src/test/resources/wordcount.txt
2. apache-apex-malhar-3.5.0/demos/wordcount/src/main/resources/samplefile.txt
3. http://www.apache.org/legal/resolved.html#category-a
4. https://opensource.org/licenses/alphabetical
5. apache-apex-malhar-3.5.0/demos/highlevelapi/src/test/resources/sampletweets.txt
6. apache-apex-malhar-3.5.0/stream/src/test/resources/sampletweets.txt
7. apache-apex-malhar-3.5.0/contrib/src/test/resources/com/datatorrent/contrib/romesyndication/datatorrent_feed_updated.rss
8. apache-apex-malhar-3.5.0/contrib/src/test/resources/com/datatorrent/contrib/romesyndication/datatorrent_feed.rss

Re: [VOTE] Apache Apex Malhar Release 3.5.0 (RC1)

2016-08-29 Thread Vlad Rozov

+1 (binding)

- checked signatures
- built with "MAVEN_OPTS=-XX:MaxPermSize=128m mvn apache-rat:check verify -Dlicense.skip=false -Pall-modules -DskipTests package"
- LICENSE and NOTICE correct

Thank you,

Vlad

On 8/28/16 21:59, Thomas Weise wrote:

Dear Community,

Please vote on the following Apache Apex Malhar 3.5.0 release candidate.

This is a source release with binary artifacts published to Maven.

This release is based on Apex Core 3.4 and comes with 61 resolved issues.

The release advances the high level stream API to support stateful
transformations with Beam style windowing semantics. The demo package has
examples for usage of the API. There are also important improvements to
underlying operator state management components, which are functional first
cut and will be enhanced in upcoming releases, such as WindowOperator,
spillable collections and incremental state saving.

The release also adds several new operators.

List of all issues fixed: https://s.apache.org/5vQi

Staging directory (new dist directories don't have access sorted out yet):
https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.5.0-RC1/
Source zip:
https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.5.0-RC1/apache-apex-malhar-3.5.0-source-release.zip
Source tar.gz:
https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.5.0-RC1/apache-apex-malhar-3.5.0-source-release.tar.gz
Maven staging repository:
https://repository.apache.org/content/repositories/orgapacheapex-1016/

Git source:
https://git-wip-us.apache.org/repos/asf?p=apex-malhar.git;a=commit;h=refs/tags/v3.5.0-RC1
  (commit: f96f0025f2bc27dff79dd95e9f88d7a43bba6c41)

PGP key:
http://pgp.mit.edu:11371/pks/lookup?op=vindex=t...@apache.org
KEYS file:
https://dist.apache.org/repos/dist/release/apex/KEYS

More information at:
http://apex.apache.org

Please try the release and vote; vote will be open for at least 72 hours.

[ ] +1 approve (and what verification was done)
[ ] -1 disapprove (and reason why)

http://www.apache.org/foundation/voting.html

How to verify release candidate:

http://apex.apache.org/verification.html

Thanks,
Thomas





[jira] [Commented] (APEXMALHAR-2130) implement scalable windowed storage

2016-08-29 Thread Timothy Farkas (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447700#comment-15447700
 ] 

Timothy Farkas commented on APEXMALHAR-2130:


[~davidyan] As far as I can tell, what you have implemented on top of the 
spillable data structures is 95% of the way there. I'm not sure what has changed.

1. SpillableByteMap already supports deleting a key by setting the value for a 
key to an empty byte array, so that's there. This can easily be extended to 
delete all the values in SpillableArrayListMultimap.

2. The only missing piece is the ability to iterate over the set of keys in a 
map. This can be done with another SpillableByteMap; let's call this map the 
linkedListMap. The key of the linkedListMap represents the current node, and the 
value represents the next node. You then keep track of your "head" key and 
iterate by taking your current node and getting its value. The value then 
becomes the current node, and so on. When the value for the current node is null, 
you are done traversing the list. You can take this and wrap it in an iterator.
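
A minimal sketch of the linkedListMap idea from point 2, assuming a plain HashMap as a stand-in for the second SpillableByteMap. LinkedKeyIndex and its methods are hypothetical names, and removing keys from the chain is not covered here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// A HashMap stands in for the second SpillableByteMap ("linkedListMap").
class LinkedKeyIndex<K> implements Iterable<K> {
  private final Map<K, K> nextOf = new HashMap<>(); // current node -> next node
  private K head; // most recently added key; null while the index is empty

  // Prepends a key that is not yet in the index: it points at the old head.
  // Adding the same key twice would break the chain, so callers must avoid it.
  void add(K key) {
    nextOf.put(key, head); // a null value marks the end of the list
    head = key;
  }

  @Override
  public Iterator<K> iterator() {
    return new Iterator<K>() {
      private K current = head;

      @Override
      public boolean hasNext() {
        return current != null;
      }

      @Override
      public K next() {
        if (current == null) {
          throw new NoSuchElementException();
        }
        K result = current;
        current = nextOf.get(current); // the value becomes the current node
        return result;
      }
    };
  }

  // Convenience helper: collects all keys by walking the chain from the head.
  List<K> keys() {
    List<K> result = new ArrayList<>();
    for (K key : this) {
      result.add(key);
    }
    return result;
  }
}
```

Because new keys are prepended, iteration returns them in reverse insertion order; only the head key has to be remembered outside the map.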




[GitHub] apex-malhar pull request #390: Remove obsolete japicmp exclude, use latest p...

2016-08-29 Thread tweise
Github user tweise closed the pull request at:

https://github.com/apache/apex-malhar/pull/390




[GitHub] apex-malhar pull request #390: Remove obsolete japicmp exclude, use latest p...

2016-08-29 Thread tweise
GitHub user tweise opened a pull request:

https://github.com/apache/apex-malhar/pull/390

Remove obsolete japicmp exclude, use latest patch version.

@vrozov please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tweise/apex-malhar japicmp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/390.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #390


commit 3e2719ff213f7ea96fa20c3a6d5f811b4df08e78
Author: Thomas Weise 
Date:   2016-08-30T01:33:06Z

Remove obsolete japicmp exclude, use latest patch version.






[GitHub] apex-core pull request #377: Remove obsolete japicmp exclude, use latest pat...

2016-08-29 Thread tweise
GitHub user tweise opened a pull request:

https://github.com/apache/apex-core/pull/377

Remove obsolete japicmp exclude, use latest patch version.

@vrozov please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tweise/apex-core japicmp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/377.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #377


commit a09d3daa3f2cbfdcb8a04eebce2f3a0f56e256d3
Author: Thomas Weise 
Date:   2016-08-30T01:22:23Z

Remove obsolete japicmp exclude, use latest patch version.






[jira] [Commented] (APEXMALHAR-2130) implement scalable windowed storage

2016-08-29 Thread David Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447624#comment-15447624
 ] 

David Yan commented on APEXMALHAR-2130:
---

We are in the middle of using [~timothyfarkas]'s spillable data structures to 
implement the state storage for the WindowedOperator. We need something that 
supports the equivalent of Map<Window, Map<K, V>> and Map<Window, V>, where 
Window is an event-time-based window and recovery is based on Apex windows. 

There are some gaps between the current state and what we need, most notably:

1. Getting all keys given a window from Map<Window, Map<K, V>>
2. Getting all windows from Map<Window, V>
3. Deleting a window from Map<Window, Map<K, V>> and from Map<Window, V>
4. Deleting a key given a window from Map<Window, Map<K, V>>

Because of the above required features, implementing Map<Window, Map<K, V>> 
with a SpillableByteMap, V> in conjunction with a 
SpillableArrayListMultimap will not work.

We are considering the following:

To support 1 and 2:
- Add support for getting all keys >= a given key by taking advantage of the 
FileAccess.FileReader.seek() and next() methods, and expose the functionality in 
the Bucket interface.
- The seek() and next() calls need to take a timebucket. That means that in order 
to support 1, we need the ability to derive the timebucket from the event-time 
window, and SpillableByteMap needs to support a user-provided mapping from Key 
to timebucket. (If such a mapping is provided, the timebucket will no longer be 
assumed to be -1.) 
- To support 2, we also need to add the functionality of getting the list of all 
timeBuckets.
- Since event time is arbitrary, unlike processing time, the actual key 
representing the timebucket cannot be assumed to be a natural sequence. However, 
TimeBucketAssigner.getTimeBucketAndAdjustBoundaries seems to return a long that 
is sequential starting from 0. We want to make the actual timebucket key based 
on the actual event window timestamp. [~csingh] Will this break anything?
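
To make the proposal for 1 and 2 concrete, here is a rough in-memory sketch. It assumes a TreeMap per timebucket in place of a sorted bucket file, with tailMap() playing the role of seek() followed by repeated next(). BucketScanSketch and its methods are hypothetical and do not reflect the real Bucket or FileAccess interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// In-memory stand-in: one sorted map per timebucket instead of a bucket file.
class BucketScanSketch {
  private final TreeMap<Long, TreeMap<String, String>> buckets = new TreeMap<>();

  void put(long timeBucket, String key, String value) {
    buckets.computeIfAbsent(timeBucket, tb -> new TreeMap<>()).put(key, value);
  }

  // For item 1: all keys >= fromKey within one timebucket, in sorted order.
  List<String> keysFrom(long timeBucket, String fromKey) {
    List<String> result = new ArrayList<>();
    TreeMap<String, String> bucket = buckets.get(timeBucket);
    if (bucket != null) {
      result.addAll(bucket.tailMap(fromKey, true).keySet());
    }
    return result;
  }

  // For item 2: the list of all timebuckets currently held, in ascending order.
  List<Long> timeBuckets() {
    return new ArrayList<>(buckets.keySet());
  }
}
```

If the timebucket is derived directly from the event-time window, listing the keys of one window becomes a scan over a single bucket.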

To support 3 and 4:

- We are thinking of a special valueSlice that denotes a deleted key. When a 
key is deleted, we just set the value to be the special valueSlice. The get 
methods will also handle it accordingly.
- Expiring and purging are done very differently and should be based on 3. 
Managed State should determine whether to purge a timebucket based on whether 
an Apex window is committed and whether all event windows that belong to that 
timebucket are marked "deleted" for that Apex window.
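
The special "deleted" value proposed for 3 and 4 is essentially a tombstone. A minimal sketch follows, under the assumption that an in-memory marker compared by identity is good enough for illustration; a real implementation would need a reserved serialized value. TombstoneMapSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class TombstoneMapSketch {
  // Hypothetical marker value; compared by identity so it cannot clash with
  // user data in this in-memory sketch.
  private static final byte[] DELETED = new byte[0];
  private final Map<String, byte[]> store = new HashMap<>();

  void put(String key, byte[] value) {
    store.put(key, value);
  }

  // Deleting writes the marker instead of removing the entry.
  void delete(String key) {
    store.put(key, DELETED);
  }

  // The get method handles the marker by reporting the key as absent.
  byte[] get(String key) {
    byte[] value = store.get(key);
    return value == DELETED ? null : value;
  }

  // True when every stored entry is a tombstone, which is the kind of check
  // a purge decision for a timebucket could be based on.
  boolean allDeleted() {
    return store.values().stream().allMatch(v -> v == DELETED);
  }
}
```

Keeping the tombstone rather than removing the entry lets a later purge step see that the key was deleted as of a given Apex window.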

As you can see, going ahead with this will require some surgery on the existing 
ManagedState and Spillable data structures.
This is based on my limited knowledge of Managed State, so please pardon me and 
correct me if my statements don't make sense.

[~csingh] [~timothyfarkas] Please comment.




Re: [VOTE] Apache Apex Malhar Release 3.5.0 (RC1)

2016-08-29 Thread Shubham Pathak
+1


   - File Integrity : *OK*
   - Existence of LICENSE, NOTICE, README.md and CHANGELOG.md files : *OK*
   - No unexpected binary files in the sources
   - mvn clean apache-rat:check verify -Dlicense.skip=false -Pall-modules
   install : *OK*
   - mvn verify -Papache-release -DskipTests: *OK*
   - Launched Pi Demo

Thanks,
Shubham

On Mon, Aug 29, 2016 at 2:33 PM, Thomas Weise 
wrote:

> The documentation is now also generated, after resolving an issue with the
> release instructions:
>
> http://apex.apache.org/docs/malhar-3.5/
>


Re: [VOTE] Apache Apex Malhar Release 3.5.0 (RC1)

2016-08-29 Thread Siyuan Hua
+1

Checked for
- File integration
- Source code verification
- Check for compilation and license
- Run pi demo for 10 min




[jira] [Commented] (APEXMALHAR-2207) JsonFormatterTest application test should check for presence of expected results

2016-08-29 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447547#comment-15447547
 ] 

Thomas Weise commented on APEXMALHAR-2207:
--

cloneDAG performs the same serialization that would occur when launching an 
application.

Test code does not belong in the embedded cluster. IMO the more appropriate
place for it is shared utilities in the test package. There are additional
considerations unrelated to and beyond this JIRA; I will start an email thread
for those.


> JsonFormatterTest application test should check for presence of expected 
> results
> 
>
> Key: APEXMALHAR-2207
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2207
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Yogi Devendra
>Assignee: shubham pathak
>Priority: Minor
>
> Implement proper assertions and test termination. Remove all console outputs 
> and console stream manipulation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2207) JsonFormatterTest application test should check for presence of expected results

2016-08-29 Thread Sandesh (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447482#comment-15447482
 ] 

Sandesh commented on APEXMALHAR-2207:
-

StramLocalCluster internally calls

cloneLogicalPlan(dag); (StramLocalCluster.java, line 280)

We can add an extra check that compares the output of that call to the original
DAG; this would eliminate boilerplate tests.

Also, cloneDAG uses Java serialization, so it won't catch runtime
serialization issues that Kryo might face.
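
To illustrate the point about Java serialization, a round trip like the one
below is roughly the kind of check that cloning through Java serialization
exercises. This is a minimal sketch, not Apex code: JavaSerializationRoundTrip,
SampleOperator and roundTrip are hypothetical names, and only java.io is used.
It does not exercise Kryo, so Kryo-specific problems at run time would still go
unnoticed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class JavaSerializationRoundTrip {

  // Hypothetical stand-in for an operator; not an Apex class.
  static class SampleOperator implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final int batchSize;

    SampleOperator(String name, int batchSize) {
      this.name = name;
      this.batchSize = batchSize;
    }
  }

  // Writes the object to a byte array and reads it back, using Java serialization only.
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(original);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      // Surface serialization problems as an unchecked failure with the cause attached.
      throw new IllegalStateException("Java serialization round trip failed", e);
    }
  }
}
```

A test could call roundTrip on an operator instance and assert on the fields of
the copy, which stays silent on success and carries the cause on failure.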



[jira] [Commented] (APEXMALHAR-2207) JsonFormatterTest application test should check for presence of expected results

2016-08-29 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447376#comment-15447376
 ] 

Thomas Weise commented on APEXMALHAR-2207:
--

Yes, cloneDAG performs the serialization of the logical plan. What other issues 
are you expecting to catch by running the embedded cluster?

In any case, please ensure the following:
- no console output when run as an automated test (see the high level API tests
for how that can be done)
- no excessive logging
- tight assertions

The tests should be reliable in the CLI. Instead of printing and logging a lot
of noise, we would like detailed information only when tests fail.
 



[jira] [Comment Edited] (APEXMALHAR-2207) JsonFormatterTest application test should check for presence of expected results

2016-08-29 Thread shubham pathak (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447277#comment-15447277
 ] 

shubham pathak edited comment on APEXMALHAR-2207 at 8/29/16 10:42 PM:
--

The application test is included to catch any serialization issues that arise
when the operator is part of an application. To remove console output, I will
test for serialization using the approach followed in
https://github.com/apache/apex-malhar/blob/master/demos/twitter/src/test/java/com/datatorrent/demos/twitter/TwitterDumpApplicationTest.java


was (Author: shubhamp):
Application test is included to catch any serialization issues if any when the 
operator is part of the application. To remove console output, will test for 
serialization using approach followed in 
https://github.com/apache/apex-malhar/blob/master/demos/twitter/src/test/java/com/datatorrent/demos/twitter/TwitterDumpApplicationTest.java



[jira] [Commented] (APEXMALHAR-2207) JsonFormatterTest application test should check for presence of expected results

2016-08-29 Thread shubham pathak (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447277#comment-15447277
 ] 

shubham pathak commented on APEXMALHAR-2207:


Application test is included to catch any serialization issues if any when the 
operator is part of the application. To remove console output, will test for 
serialization using approach followed in 
https://github.com/apache/apex-malhar/blob/master/demos/twitter/src/test/java/com/datatorrent/demos/twitter/TwitterDumpApplicationTest.java



Re: [jira] [Commented] (APEXCORE-514) Apache Apex website update

2016-08-29 Thread Kartavya Jain
I am okay with Roadmap.



On Tue, Aug 23, 2016 at 4:04 PM, Thomas Weise 
wrote:

> I think the common and appropriate term for this is roadmap.
>
> On Tue, Aug 23, 2016 at 3:41 PM, Ashwin Chandra Putta <
> ashwinchand...@gmail.com> wrote:
>
> > We already discussed this in another thread and the consensus was to call
> > it "Initiatives" and add it to the top nav bar.
> >
> > Regards,
> > Ashwin.
> >
> > On Tue, Aug 23, 2016 at 3:35 PM, Kartavya Jain wrote:
> >
> > > We can place the "Roadmap" tab on the main nav bar so it is easily visible
> > > and accessible. Currently, it's buried under the Community page and not
> > > easy to find.
> > > Let me know if there are any concerns with that.
> > >
> > >
> > >
> > > On Tue, Aug 23, 2016 at 2:52 PM, Thomas Weise (JIRA) 
> > > wrote:
> > >
> > > >
> > > > > [ https://issues.apache.org/jira/browse/APEXCORE-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433686#comment-15433686 ]
> > > > >
> > > > > Thomas Weise commented on APEXCORE-514:
> > > > > ---
> > > > >
> > > > > There is already a roadmap page, so I would suggest adding any related
> > > > > info for 5.) there and making it easier to navigate to.
> > > > >
> > > > > http://apex.apache.org/roadmap.html
> > > > >
> > > > >
> > > > > > Apache Apex website update
> > > > > > --
> > > > > >
> > > > > > Key: APEXCORE-514
> > > > > > URL: https://issues.apache.org/jira/browse/APEXCORE-514
> > > > > > Project: Apache Apex Core
> > > > > >  Issue Type: Improvement
> > > > > >Reporter: Michelle Xiao
> > > > > >Assignee: Michelle Xiao
> > > > > >  Labels: features
> > > > > >
> > > > > > Update Apache Apex website:
> > > > > > 1. add "Powered By Apex" between Community and Docs on the top
> > > > > > navigation bar;
> > > > > > 2. add another "Download" after Github under Apache Apex section of
> > > > > > Home page, same color with existing Download;
> > > > > > 3. replace "Docs" with Documentation, add top navigation tabs linking
> > > > > > to each section in this page, navigation tab includes all section
> > > > > > titles in this page;
> > > > > > 4. add top nav bar in Community page, drop down list with all section
> > > > > > titles in this page;
> > > > > > 5. add "Initiatives" after Home linking to a new initiatives page.
> > > >
> > > >
> > > >
> > > > --
> > > > This message was sent by Atlassian JIRA
> > > > (v6.3.4#6332)
> > > >
> > >
> > >
> > >
> > > --
> > > Kartavya Jain
> > > Product Marketing Manager
> > > DataTorrent
> > >
> > > karta...@datatorrent.com
> > >
> >
> >
> >
> > --
> >
> > Regards,
> > Ashwin.
> >
>



-- 
Kartavya Jain
Product Marketing Manager
DataTorrent

karta...@datatorrent.com


Re: [VOTE] Apache Apex Malhar Release 3.5.0 (RC1)

2016-08-29 Thread Thomas Weise
The documentation is now also generated, after resolving an issue with the
release instructions:

http://apex.apache.org/docs/malhar-3.5/




[GitHub] apex-site pull request #48: Fix documentation deployment steps.

2016-08-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-site/pull/48


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] apex-site pull request #48: Fix documentation deployment steps.

2016-08-29 Thread tweise
GitHub user tweise opened a pull request:

https://github.com/apache/apex-site/pull/48

Fix documentation deployment steps.

@vrozov please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tweise/apex-site master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-site/pull/48.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #48


commit 693634e990794c42e19fa046ccb26116becee278
Author: Thomas Weise 
Date:   2016-08-29T21:12:21Z

Fix documentation deployment steps.






[jira] [Updated] (APEXMALHAR-2214) Use tuple internally so it can carry over the window semantics

2016-08-29 Thread Siyuan Hua (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyuan Hua updated APEXMALHAR-2214:
---
Assignee: Siyuan Hua

> Use tuple internally so it can carry over the window semantics
> --
>
> Key: APEXMALHAR-2214
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2214
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Siyuan Hua
>Assignee: Siyuan Hua
>
> If a windowed operation is followed by an unwindowed operation, we need to
> carry the window semantics over the DAG. That is why, even though unwindowed
> operations act on the value of a tuple, we still need to use the Tuple type on
> the network.





[jira] [Created] (APEXMALHAR-2214) Use tuple internally so it can carry over the window semantics

2016-08-29 Thread Siyuan Hua (JIRA)
Siyuan Hua created APEXMALHAR-2214:
--

 Summary: Use tuple internally so it can carry over the window 
semantics
 Key: APEXMALHAR-2214
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2214
 Project: Apache Apex Malhar
  Issue Type: Bug
Reporter: Siyuan Hua


If a windowed operation is followed by an unwindowed operation, we need to
carry the window semantics over the DAG. That is why, even though unwindowed
operations act on the value of a tuple, we still need to use the Tuple type on
the network.
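
As a rough illustration of the idea, the sketch below wraps each value together
with its window, and an unwindowed step transforms only the value while the
window travels along unchanged. The names WindowedValueSketch, WindowedValue
and mapValue are hypothetical and are not the Malhar Tuple classes; the window
is assumed here to be a simple start time plus duration.

```java
import java.util.function.Function;

class WindowedValueSketch {

  // Hypothetical tuple wrapper: a value plus the window it was assigned to.
  static final class WindowedValue<T> {
    final long windowStartMillis;
    final long windowDurationMillis;
    final T value;

    WindowedValue(long windowStartMillis, long windowDurationMillis, T value) {
      this.windowStartMillis = windowStartMillis;
      this.windowDurationMillis = windowDurationMillis;
      this.value = value;
    }

    // Applies an unwindowed function to the value and keeps the window unchanged,
    // so a downstream windowed operation still sees the original assignment.
    <R> WindowedValue<R> mapValue(Function<? super T, ? extends R> fn) {
      return new WindowedValue<R>(windowStartMillis, windowDurationMillis, fn.apply(value));
    }
  }
}
```

If the stream carried only the bare value between operators, the window
assignment would be lost after the first unwindowed step, which is the
motivation stated above for using a tuple type throughout.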





[jira] [Created] (APEXMALHAR-2213) Refactor Stream API to make it easier to specify keys for keyed transformation

2016-08-29 Thread Siyuan Hua (JIRA)
Siyuan Hua created APEXMALHAR-2213:
--

 Summary: Refactor Stream API to make it easier to specify keys for 
keyed transformation 
 Key: APEXMALHAR-2213
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2213
 Project: Apache Apex Malhar
  Issue Type: Bug
Reporter: Siyuan Hua
Assignee: Siyuan Hua


Right now, the keyed transformation forces you to specify the MapToKeyVal
interface. This is not flexible if the output of the upstream transformation is
already a KeyValuePair or if people want to specify an explicit Map before the
keyed operation.
 







[GitHub] apex-malhar pull request #389: Apex malhar 2143 - Evaluate and retire lib/ma...

2016-08-29 Thread prasannapramod
GitHub user prasannapramod opened a pull request:

https://github.com/apache/apex-malhar/pull/389

Apex malhar 2143 - Evaluate and retire lib/math, lib/algo, and 
lib/streamquery operators

@davidyan74 @PramodSSImmaneni @tweise please review. I am working on
fixing the functionality of operators like "BottomNUniqueMap",
"InsertSort", "TopN", "UniqueCounter" and "MarginKeyVal". I am also planning to
reimplement some operators like "BottomNMap" using the window operator and
managed state.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/prasannapramod/apex-malhar ApexMalhar-2143

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/389.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #389


commit 2e76e56c6066bf25895655ec2a7d694c68853633
Author: Lakshmi Prasanna Velineni 
Date:   2016-08-25T03:54:53Z

Updated algo & working on math operators

commit 1440b13748c83029f107be244192c9c66daeb367
Author: Lakshmi Prasanna Velineni 
Date:   2016-08-29T18:23:50Z

worked on algo & streamquery and related issues






[jira] [Created] (APEXMALHAR-2212) Tuple codec to make application build by High-level api more efficient

2016-08-29 Thread Siyuan Hua (JIRA)
Siyuan Hua created APEXMALHAR-2212:
--

 Summary: Tuple codec to make application build by High-level api 
more efficient
 Key: APEXMALHAR-2212
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2212
 Project: Apache Apex Malhar
  Issue Type: Bug
Reporter: Siyuan Hua


We should have a tuple codec to make applications built with the high-level API
much faster





[jira] [Commented] (APEXCORE-516) StramLocalCluster should always use loopback address for buffer server location

2016-08-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446637#comment-15446637
 ] 

ASF GitHub Bot commented on APEXCORE-516:
-

GitHub user vrozov opened a pull request:

https://github.com/apache/apex-core/pull/376

APEXCORE-516 - StramLocalCluster should always use loopback address for 
buffer server location

@tweise Please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/apex-core APEXCORE-516

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/376.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #376






> StramLocalCluster should always use loopback address for buffer server 
> location
> ---
>
> Key: APEXCORE-516
> URL: https://issues.apache.org/jira/browse/APEXCORE-516
> Project: Apache Apex Core
>  Issue Type: Improvement
> Environment: Jenkins Apache (builds.apache.org using beam label)
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> In incorrectly configured environments where InetAddress.getLocalHost()
> returns a resolvable address that is not among the active interfaces of the
> host where the buffer server is deployed, neither the buffer server publisher
> nor the subscriber will be able to connect to the buffer server. In most cases,
> this issue should be solved by properly configuring the cluster or node where
> the buffer server is deployed. This fix should address only the case where Apex
> is deployed into an automatically provisioned environment for running unit and
> integration tests.
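
For illustration only, the difference described above can be sketched with the
JDK alone. InetAddress.getLoopbackAddress() does not depend on how the host
name resolves, whereas InetAddress.getLocalHost() does. The class and method
names below (LoopbackAddressSketch, localBufferServerAddress) are hypothetical
and are not taken from the pull request.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

class LoopbackAddressSketch {

  // Builds the address an in-process test cluster could advertise for its
  // buffer server. The loopback address is always bound on the local host,
  // so it avoids relying on the host name resolving to an active interface.
  static InetSocketAddress localBufferServerAddress(int port) {
    return new InetSocketAddress(InetAddress.getLoopbackAddress(), port);
  }
}
```

This only makes sense when publisher, subscriber and buffer server all run on
the same host, as they do in a local test cluster.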





[GitHub] apex-core pull request #376: APEXCORE-516 - StramLocalCluster should always ...

2016-08-29 Thread vrozov
GitHub user vrozov opened a pull request:

https://github.com/apache/apex-core/pull/376

APEXCORE-516 - StramLocalCluster should always use loopback address for 
buffer server location

@tweise Please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/apex-core APEXCORE-516

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/376.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #376








[jira] [Commented] (APEXMALHAR-2210) Improve Join operator in malhar library for different types of joins

2016-08-29 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446530#comment-15446530
 ] 

Thomas Weise commented on APEXMALHAR-2210:
--

We already have different join operators in the library. We also have needs 
regarding flexible windowing and related work for join in progress. Please have 
a look and start a discussion on the dev list so we arrive at a plan that fits 
the bigger picture.

> Improve Join operator in malhar library for different types of joins
> 
>
> Key: APEXMALHAR-2210
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2210
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Chinmay Kolhatkar
>Assignee: Chaitanya
>
> 1. Improve version of Join operator in malhar for performance and stability
> 2. Add support for other types of join in the operator





[jira] [Commented] (APEXMALHAR-2206) Some Application tests taking too long

2016-08-29 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446471#comment-15446471
 ] 

Thomas Weise commented on APEXMALHAR-2206:
--

Yogi, I don't think we need to see this in the change log. I would also suggest 
that in the future we commit fixes to unreleased work against the original JIRA.

> Some Application tests taking too long
> --
>
> Key: APEXMALHAR-2206
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2206
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Yogi Devendra
>Priority: Minor
>
> Some application tests seem to run for a long time.
> Additionally, the size of the target folder grows too much, to about 2 GB.





[GitHub] apex-malhar pull request #386: APEXMALHAR-2205 State management benchmark

2016-08-29 Thread brightchen
GitHub user brightchen reopened a pull request:

https://github.com/apache/apex-malhar/pull/386

APEXMALHAR-2205 State management benchmark



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2205

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/386.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #386


commit 7be771bc5f58322fd8ddaf6f8caa8bc799791f9a
Author: brightchen 
Date:   2016-08-26T23:09:12Z

APEXMALHAR-2205 State management benchmark






[jira] [Commented] (APEXMALHAR-2205) State management benchmark

2016-08-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446398#comment-15446398
 ] 

ASF GitHub Bot commented on APEXMALHAR-2205:


Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/386


> State management benchmark
> --
>
> Key: APEXMALHAR-2205
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2205
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: bright chen
>






[GitHub] apex-malhar pull request #386: APEXMALHAR-2205 State management benchmark

2016-08-29 Thread brightchen
Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/386




Re: [VOTE] Apache Apex Malhar Release 3.5.0 (RC1)

2016-08-29 Thread Yogi Devendra
+1

Checked for
- Signatures
- No binary files
- Rat check and build passed



~ Yogi



[jira] [Commented] (APEXMALHAR-2206) Some Application tests taking too long

2016-08-29 Thread Yogi Devendra (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15445329#comment-15445329
 ] 

Yogi Devendra commented on APEXMALHAR-2206:
---

Should we mark fix version as 3.5.0 for this ticket?
