[jira] [Resolved] (APEXCORE-251) Journal output stream is null error message

2016-11-16 Thread Tushar Gosavi (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tushar Gosavi resolved APEXCORE-251.

   Resolution: Fixed
Fix Version/s: 3.5.0

> Journal output stream is null error message
> ---
>
> Key: APEXCORE-251
> URL: https://issues.apache.org/jira/browse/APEXCORE-251
> Project: Apache Apex Core
>  Issue Type: Task
>Reporter: Chetan Narsude
>Assignee: Vlad Rozov
> Fix For: 3.5.0
>
>
> A simple checkout and test run prints that message a gazillion times as
> WARN.
> {code}
> 2015-11-07 07:30:05,047 [master] WARN  stram.Journal write - Journal output stream is null. Skipping write to the WAL.
> {code}
> Looks like it could be a debug message or something. Make it less verbose.
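
One way to make such a message less verbose is to log it at debug level. The sketch below is an illustration only, not the change made for this issue: JournalLogSketch is a hypothetical stand-in for the journal class, and it uses java.util.logging (where FINE plays the role of debug) so that it runs with the JDK alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.logging.Logger;

// Hypothetical stand-in for the journal class; not the actual Apex code.
class JournalLogSketch {
  private static final Logger LOG = Logger.getLogger(JournalLogSketch.class.getName());

  private OutputStream out; // stays null until a stream is attached

  void setOutputStream(OutputStream out) {
    this.out = out;
  }

  /** Returns true if the entry was written, false if the write was skipped. */
  boolean write(byte[] entry) {
    if (out == null) {
      // FINE (debug) instead of WARNING: hidden under the default log configuration.
      LOG.fine("Journal output stream is null. Skipping write to the WAL.");
      return false;
    }
    try {
      out.write(entry);
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    JournalLogSketch journal = new JournalLogSketch();
    System.out.println(journal.write(new byte[] {1})); // skipped, no stream attached yet
    journal.setOutputStream(new ByteArrayOutputStream());
    System.out.println(journal.write(new byte[] {1})); // written
  }
}
```

With the default java.util.logging configuration, FINE records are not printed, so a test run that never attaches a stream stays quiet.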



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-251) Journal output stream is null error message

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672899#comment-15672899
 ] 

ASF GitHub Bot commented on APEXCORE-251:
-

Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/419


> Journal output stream is null error message
> ---
>
> Key: APEXCORE-251
> URL: https://issues.apache.org/jira/browse/APEXCORE-251
> Project: Apache Apex Core
>  Issue Type: Task
>Reporter: Chetan Narsude
>Assignee: Vlad Rozov
>
> A simple checkout and test run prints that message a gazillion times as
> WARN.
> {code}
> 2015-11-07 07:30:05,047 [master] WARN  stram.Journal write - Journal output stream is null. Skipping write to the WAL.
> {code}
> Looks like it could be a debug message or something. Make it less verbose.





[GitHub] apex-core pull request #419: APEXCORE-251 - Suppress "Journal output stream ...

2016-11-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/419


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Visitor API for DAG

2016-11-16 Thread Tushar Gosavi
Hi All,

How about adding a visitor-like API for the DAG in Apex, along with an API
to register a visitor with the DAG?
Possible use cases are:
- A validator visitor that validates the DAG.
- A visitor that injects properties/attributes into the operators/streams
  from some external source.
- The platform does not support validation of individual operators. A
  developer could write a validator visitor that calls the validate
  function of an operator if it implements a Validator interface.
- Generating the output schema based on the operator config and input
  schema, and setting that schema on the output stream.

Sample API:

dag.registerVisitor(DAGVisitor visitor);

Call order of the visitor functions:
- preVisitDAG(Attributes) // DAG attributes
  for all operators
  - visitOperator(OperatorMeta meta) // access to operator, name,
    attributes, properties, ports
  - visitStream(StreamMeta meta) // access to
    stream/name/attributes/properties/ports
- postVisitDAG()
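
The call order above could be sketched roughly as follows. This is only an illustration under stated assumptions: every type here (DagVisitor, Dag, OperatorMeta, StreamMeta, Validator, ValidatingVisitor) is a simplified, hypothetical stand-in rather than the real Apex class, the DAG attributes are modelled as a plain map, and the sketch assumes all operators are visited before all streams.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed visitor API; these are not the real Apex types.
class DagVisitorSketch {
  /** Callbacks, in the call order listed in the proposal. */
  interface DagVisitor {
    void preVisitDAG(Map<String, Object> attributes);
    void visitOperator(OperatorMeta meta);
    void visitStream(StreamMeta meta);
    void postVisitDAG();
  }

  /** Optional interface an operator may implement (validation use case). */
  interface Validator {
    void validate();
  }

  static final class OperatorMeta {
    final String name;
    final Object operator;
    OperatorMeta(String name, Object operator) {
      this.name = name;
      this.operator = operator;
    }
  }

  static final class StreamMeta {
    final String name;
    StreamMeta(String name) {
      this.name = name;
    }
  }

  /** Minimal DAG that only stores metadata and registered visitors. */
  static final class Dag {
    final Map<String, Object> attributes = new LinkedHashMap<>();
    final List<OperatorMeta> operators = new ArrayList<>();
    final List<StreamMeta> streams = new ArrayList<>();
    private final List<DagVisitor> visitors = new ArrayList<>();

    void registerVisitor(DagVisitor visitor) {
      visitors.add(visitor);
    }

    /** Runs every registered visitor over the DAG once. */
    void visit() {
      for (DagVisitor visitor : visitors) {
        visitor.preVisitDAG(attributes);
        for (OperatorMeta meta : operators) {
          visitor.visitOperator(meta);
        }
        for (StreamMeta meta : streams) {
          visitor.visitStream(meta);
        }
        visitor.postVisitDAG();
      }
    }
  }

  /** Records the call order and validates operators that implement Validator. */
  static final class ValidatingVisitor implements DagVisitor {
    final List<String> calls = new ArrayList<>();

    @Override
    public void preVisitDAG(Map<String, Object> attributes) {
      calls.add("preVisitDAG");
    }

    @Override
    public void visitOperator(OperatorMeta meta) {
      calls.add("visitOperator:" + meta.name);
      if (meta.operator instanceof Validator) {
        ((Validator) meta.operator).validate();
      }
    }

    @Override
    public void visitStream(StreamMeta meta) {
      calls.add("visitStream:" + meta.name);
    }

    @Override
    public void postVisitDAG() {
      calls.add("postVisitDAG");
    }
  }

  public static void main(String[] args) {
    Dag dag = new Dag();
    dag.operators.add(new OperatorMeta("reader", new Object()));
    dag.streams.add(new StreamMeta("lines"));
    ValidatingVisitor visitor = new ValidatingVisitor();
    dag.registerVisitor(visitor);
    dag.visit();
    System.out.println(visitor.calls);
  }
}
```

Keeping the traversal inside the DAG means a visitor only sees metadata objects, which is what would let the same hook serve validation, property injection and schema propagation.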

Regards,
-Tushar.


[jira] [Created] (APEXMALHAR-2344) Initialize the list of FieldInfo in JDBCPollInput operator from properties.xml

2016-11-16 Thread Hitesh Kapoor (JIRA)
Hitesh Kapoor created APEXMALHAR-2344:
-

 Summary: Initialize the list of FieldInfo in JDBCPollInput 
operator from properties.xml
 Key: APEXMALHAR-2344
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2344
 Project: Apache Apex Malhar
  Issue Type: Improvement
Reporter: Hitesh Kapoor
Assignee: Hitesh Kapoor








[GitHub] apex-malhar pull request #501: APEXMALHAR-2340 code changes to initialize th...

2016-11-16 Thread Hitesh-Scorpio
GitHub user Hitesh-Scorpio opened a pull request:

https://github.com/apache/apex-malhar/pull/501

APEXMALHAR-2340 code changes to initialize the list of JdbcFieldInfo …

…in JdbcPOJOInsertOutput from properties.xml @devtagare @bhupeshchawda 
please review the code changes

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Hitesh-Scorpio/apex-malhar APEXMALHAR-2291_JdbcExactlyOnce

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/501.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #501


commit 424b97f59de8149ac23644dacaf4aab6b3d52339
Author: Hitesh-Scorpio 
Date:   2016-11-17T05:19:20Z

APEXMALHAR-2340 code changes to initialize the list of JdbcFieldInfo in 
JdbcPOJOInsertOutput from properties.xml






Re: Proposal for apex/malhar extensions

2016-11-16 Thread Pramod Immaneni
Yes, I think it could be useful for core as well.

On Wed, Nov 16, 2016 at 11:19 AM, Chinmay Kolhatkar 
wrote:

> @sanjay, yes we can define the process around this.
>
> @pramod, Well, I said apex-malhar because the extensions can be operators
> and and other plugins to apex engine.
> Do you see the use of this for apex-core as well?
>
> -Chinmay.
>
>
> On Wed, Nov 16, 2016 at 7:24 PM, Pramod Immaneni 
> wrote:
>
> > So it would be like a yellow pages for the apex ecosystem. Sounds like a
> > good idea. Why limit it to malhar?
> >
> > On Wed, Nov 16, 2016 at 3:17 AM, Chinmay Kolhatkar 
> > wrote:
> >
> > > Dear Community,
> > >
> > > This is in relation to malhar cleanup work that is ongoing.
> > >
> > > In one of the talks during Apache BigData Europe, I got to know about
> > > Spark-Packages (https://spark-packages.org/) (I believe lot of you
> must
> > be
> > > aware of it).
> > > Spark package is basically functionality over and above and using Spark
> > > core functionality. The spark packages can initially present in
> someone's
> > > public repository and one could register that with
> > > https://spark-packages.org/ and later on as it matures and finds more
> > use,
> > > it gets consumed in mainstream Spark repository and releases.
> > >
> > > I found this idea quite interesting to keep our apex-malhar releases
> > > cleaner.
> > >
> > > One could have extension to apex-malhar in their own repository and
> just
> > > register itself with Apache Apex. As it matures and find more and more
> > use
> > > we can consume that in mainstream releases.
> > > Advantages to this are multiple:
> > > 1. The entry point for registering extensions with Apache Apex can be
> > > minimal. This way we get more indirect contributions.
> > > 2. Faster way to add more feature in the project.
> > > 3. We keep our releases cleaner.
> > > 4. One could progress on feature-set faster balancing both Apache Way
> as
> > > well as their own Enterprise Interests.
> > >
> > > Please share your thoughts on this.
> > >
> > > Thanks,
> > > Chinmay.
> > >
> >
>


[jira] [Commented] (APEXMALHAR-2343) Count Accumulation should only increase one for each tuple

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671464#comment-15671464
 ] 

ASF GitHub Bot commented on APEXMALHAR-2343:


GitHub user brightchen opened a pull request:

https://github.com/apache/apex-malhar/pull/500

APEXMALHAR-2343 #resolve #comment Count Accumulation should only incr…

…ease one for each tuple

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2343

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/500.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #500






> Count Accumulation should only increase one for each tuple
> --
>
> Key: APEXMALHAR-2343
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2343
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: bright chen
>Assignee: bright chen
>






[GitHub] apex-malhar pull request #500: APEXMALHAR-2343 #resolve #comment Count Accum...

2016-11-16 Thread brightchen
GitHub user brightchen opened a pull request:

https://github.com/apache/apex-malhar/pull/500

APEXMALHAR-2343 #resolve #comment Count Accumulation should only incr…

…ease one for each tuple

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2343

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/500.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #500








[jira] [Commented] (APEXMALHAR-2343) Count Accumulation should only increase one for each tuple

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671408#comment-15671408
 ] 

ASF GitHub Bot commented on APEXMALHAR-2343:


Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/499


> Count Accumulation should only increase one for each tuple
> --
>
> Key: APEXMALHAR-2343
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2343
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: bright chen
>Assignee: bright chen
>






[GitHub] apex-malhar pull request #499: APEXMALHAR-2343 #resolve #comment Count Accum...

2016-11-16 Thread brightchen
Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/499




Re: Proposal for apex/malhar extensions

2016-11-16 Thread Sandesh Hegde
Do we have any projects today that can benefit from this setup?
Earlier in this mail thread, we discussed "contrib (low bar) & graduation"
in Malhar. Is that not sufficient?

On Wed, Nov 16, 2016 at 11:19 AM Chinmay Kolhatkar 
wrote:

> @sanjay, yes we can define the process around this.
>
> @pramod, Well, I said apex-malhar because the extensions can be operators
> and and other plugins to apex engine.
> Do you see the use of this for apex-core as well?
>
> -Chinmay.
>
>
> On Wed, Nov 16, 2016 at 7:24 PM, Pramod Immaneni 
> wrote:
>
> > So it would be like a yellow pages for the apex ecosystem. Sounds like a
> > good idea. Why limit it to malhar?
> >
> > On Wed, Nov 16, 2016 at 3:17 AM, Chinmay Kolhatkar 
> > wrote:
> >
> > > Dear Community,
> > >
> > > This is in relation to malhar cleanup work that is ongoing.
> > >
> > > In one of the talks during Apache BigData Europe, I got to know about
> > > Spark-Packages (https://spark-packages.org/) (I believe lot of you
> must
> > be
> > > aware of it).
> > > Spark package is basically functionality over and above and using Spark
> > > core functionality. The spark packages can initially present in
> someone's
> > > public repository and one could register that with
> > > https://spark-packages.org/ and later on as it matures and finds more
> > use,
> > > it gets consumed in mainstream Spark repository and releases.
> > >
> > > I found this idea quite interesting to keep our apex-malhar releases
> > > cleaner.
> > >
> > > One could have extension to apex-malhar in their own repository and
> just
> > > register itself with Apache Apex. As it matures and find more and more
> > use
> > > we can consume that in mainstream releases.
> > > Advantages to this are multiple:
> > > 1. The entry point for registering extensions with Apache Apex can be
> > > minimal. This way we get more indirect contributions.
> > > 2. Faster way to add more feature in the project.
> > > 3. We keep our releases cleaner.
> > > 4. One could progress on feature-set faster balancing both Apache Way
> as
> > > well as their own Enterprise Interests.
> > >
> > > Please share your thoughts on this.
> > >
> > > Thanks,
> > > Chinmay.
> > >
> >
>


[jira] [Commented] (APEXMALHAR-2343) Count Accumulation should only increase one for each tuple

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671373#comment-15671373
 ] 

ASF GitHub Bot commented on APEXMALHAR-2343:


GitHub user brightchen opened a pull request:

https://github.com/apache/apex-malhar/pull/499

APEXMALHAR-2343 #resolve #comment Count Accumulation should only incr…

…ease one for each tuple

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2343

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/499.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #499


commit 0f2d0e8d0df9db6361b7037c735030879392be61
Author: brightchen 
Date:   2016-11-16T19:29:46Z

APEXMALHAR-2343 #resolve #comment Count Accumulation should only increase 
one for each tuple




> Count Accumulation should only increase one for each tuple
> --
>
> Key: APEXMALHAR-2343
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2343
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: bright chen
>Assignee: bright chen
>






[GitHub] apex-malhar pull request #499: APEXMALHAR-2343 #resolve #comment Count Accum...

2016-11-16 Thread brightchen
GitHub user brightchen opened a pull request:

https://github.com/apache/apex-malhar/pull/499

APEXMALHAR-2343 #resolve #comment Count Accumulation should only incr…

…ease one for each tuple

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2343

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/499.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #499


commit 0f2d0e8d0df9db6361b7037c735030879392be61
Author: brightchen 
Date:   2016-11-16T19:29:46Z

APEXMALHAR-2343 #resolve #comment Count Accumulation should only increase 
one for each tuple






Re: Proposal for apex/malhar extensions

2016-11-16 Thread Chinmay Kolhatkar
@sanjay, yes we can define the process around this.

@pramod, well, I said apex-malhar because the extensions can be operators
and other plugins for the Apex engine.
Do you see a use for this in apex-core as well?

-Chinmay.


On Wed, Nov 16, 2016 at 7:24 PM, Pramod Immaneni 
wrote:

> So it would be like a yellow pages for the apex ecosystem. Sounds like a
> good idea. Why limit it to malhar?
>
> On Wed, Nov 16, 2016 at 3:17 AM, Chinmay Kolhatkar 
> wrote:
>
> > Dear Community,
> >
> > This is in relation to malhar cleanup work that is ongoing.
> >
> > In one of the talks during Apache BigData Europe, I got to know about
> > Spark-Packages (https://spark-packages.org/) (I believe lot of you must
> be
> > aware of it).
> > Spark package is basically functionality over and above and using Spark
> > core functionality. The spark packages can initially present in someone's
> > public repository and one could register that with
> > https://spark-packages.org/ and later on as it matures and finds more
> use,
> > it gets consumed in mainstream Spark repository and releases.
> >
> > I found this idea quite interesting to keep our apex-malhar releases
> > cleaner.
> >
> > One could have extension to apex-malhar in their own repository and just
> > register itself with Apache Apex. As it matures and find more and more
> use
> > we can consume that in mainstream releases.
> > Advantages to this are multiple:
> > 1. The entry point for registering extensions with Apache Apex can be
> > minimal. This way we get more indirect contributions.
> > 2. Faster way to add more feature in the project.
> > 3. We keep our releases cleaner.
> > 4. One could progress on feature-set faster balancing both Apache Way as
> > well as their own Enterprise Interests.
> >
> > Please share your thoughts on this.
> >
> > Thanks,
> > Chinmay.
> >
>


Re: Proposal for apex/malhar extensions

2016-11-16 Thread Pramod Immaneni
So it would be like a yellow pages for the apex ecosystem. Sounds like a
good idea. Why limit it to malhar?

On Wed, Nov 16, 2016 at 3:17 AM, Chinmay Kolhatkar 
wrote:

> Dear Community,
>
> This is in relation to malhar cleanup work that is ongoing.
>
> In one of the talks during Apache BigData Europe, I got to know about
> Spark-Packages (https://spark-packages.org/) (I believe lot of you must be
> aware of it).
> Spark package is basically functionality over and above and using Spark
> core functionality. The spark packages can initially present in someone's
> public repository and one could register that with
> https://spark-packages.org/ and later on as it matures and finds more use,
> it gets consumed in mainstream Spark repository and releases.
>
> I found this idea quite interesting to keep our apex-malhar releases
> cleaner.
>
> One could have extension to apex-malhar in their own repository and just
> register itself with Apache Apex. As it matures and find more and more use
> we can consume that in mainstream releases.
> Advantages to this are multiple:
> 1. The entry point for registering extensions with Apache Apex can be
> minimal. This way we get more indirect contributions.
> 2. Faster way to add more feature in the project.
> 3. We keep our releases cleaner.
> 4. One could progress on feature-set faster balancing both Apache Way as
> well as their own Enterprise Interests.
>
> Please share your thoughts on this.
>
> Thanks,
> Chinmay.
>


Re: Proposal for apex/malhar extensions

2016-11-16 Thread Sanjay Pujare
+1 for the idea. It would be good to describe the registration mechanism to be used.

On 11/16/16, 3:17 AM, "Chinmay Kolhatkar"  wrote:

Dear Community,

This is in relation to malhar cleanup work that is ongoing.

In one of the talks during Apache BigData Europe, I got to know about
Spark-Packages (https://spark-packages.org/) (I believe lot of you must be
aware of it).
Spark package is basically functionality over and above and using Spark
core functionality. The spark packages can initially present in someone's
public repository and one could register that with
https://spark-packages.org/ and later on as it matures and finds more use,
it gets consumed in mainstream Spark repository and releases.

I found this idea quite interesting to keep our apex-malhar releases
cleaner.

One could have extension to apex-malhar in their own repository and just
register itself with Apache Apex. As it matures and find more and more use
we can consume that in mainstream releases.
Advantages to this are multiple:
1. The entry point for registering extensions with Apache Apex can be
minimal. This way we get more indirect contributions.
2. Faster way to add more feature in the project.
3. We keep our releases cleaner.
4. One could progress on feature-set faster balancing both Apache Way as
well as their own Enterprise Interests.

Please share your thoughts on this.

Thanks,
Chinmay.





[jira] [Commented] (APEXMALHAR-2076) AbstractExactlyOnceKafkaOutputOperator didn't handle the orderless of tuples in a window

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671026#comment-15671026
 ] 

ASF GitHub Bot commented on APEXMALHAR-2076:


Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/294


> AbstractExactlyOnceKafkaOutputOperator didn't handle the orderless of tuples 
> in a window
> 
>
> Key: APEXMALHAR-2076
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2076
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: bright chen
>Assignee: bright chen
>
> The order of the tuples in the same window is not guaranteed on replay.
> AbstractExactlyOnceKafkaOutputOperator's logic assumes the replayed tuples
> have the same order.





[GitHub] apex-malhar pull request #294: APEXMALHAR-2076 #resolve #comment add Abstrac...

2016-11-16 Thread brightchen
Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/294




[jira] [Commented] (APEXMALHAR-2331) StateTracker#bucketAccessed should add bucket to bucketAccessTimes

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671022#comment-15671022
 ] 

ASF GitHub Bot commented on APEXMALHAR-2331:


Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/488


> StateTracker#bucketAccessed should add bucket to bucketAccessTimes
> --
>
> Key: APEXMALHAR-2331
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2331
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: bright chen
>Assignee: bright chen
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The bucket was not added to bucketAccessTimes, which caused lots of
> BucketIdTimeWrapper instances to be created and added to bucketHeap.





[GitHub] apex-malhar pull request #488: APEXMALHAR-2331 #resolve #comment StateTracke...

2016-11-16 Thread brightchen
Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/488




Re: Proposal for apex/malhar extensions

2016-11-16 Thread Amol Kekre
+1

Thks
Amol

On Wed, Nov 16, 2016 at 5:37 AM, AJAY GUPTA  wrote:

> +1
> This is a good idea.
>
> Ajay
>
> On Wed, Nov 16, 2016 at 4:47 PM, Chinmay Kolhatkar 
> wrote:
>
> > Dear Community,
> >
> > This is in relation to malhar cleanup work that is ongoing.
> >
> > In one of the talks during Apache BigData Europe, I got to know about
> > Spark-Packages (https://spark-packages.org/) (I believe lot of you must
> be
> > aware of it).
> > Spark package is basically functionality over and above and using Spark
> > core functionality. The spark packages can initially present in someone's
> > public repository and one could register that with
> > https://spark-packages.org/ and later on as it matures and finds more
> use,
> > it gets consumed in mainstream Spark repository and releases.
> >
> > I found this idea quite interesting to keep our apex-malhar releases
> > cleaner.
> >
> > One could have extension to apex-malhar in their own repository and just
> > register itself with Apache Apex. As it matures and find more and more
> use
> > we can consume that in mainstream releases.
> > Advantages to this are multiple:
> > 1. The entry point for registering extensions with Apache Apex can be
> > minimal. This way we get more indirect contributions.
> > 2. Faster way to add more feature in the project.
> > 3. We keep our releases cleaner.
> > 4. One could progress on feature-set faster balancing both Apache Way as
> > well as their own Enterprise Interests.
> >
> > Please share your thoughts on this.
> >
> > Thanks,
> > Chinmay.
> >
>


Re: Proposal for apex/malhar extensions

2016-11-16 Thread AJAY GUPTA
+1
This is a good idea.

Ajay

On Wed, Nov 16, 2016 at 4:47 PM, Chinmay Kolhatkar 
wrote:

> Dear Community,
>
> This is in relation to malhar cleanup work that is ongoing.
>
> In one of the talks during Apache BigData Europe, I got to know about
> Spark-Packages (https://spark-packages.org/) (I believe lot of you must be
> aware of it).
> Spark package is basically functionality over and above and using Spark
> core functionality. The spark packages can initially present in someone's
> public repository and one could register that with
> https://spark-packages.org/ and later on as it matures and finds more use,
> it gets consumed in mainstream Spark repository and releases.
>
> I found this idea quite interesting to keep our apex-malhar releases
> cleaner.
>
> One could have extension to apex-malhar in their own repository and just
> register itself with Apache Apex. As it matures and find more and more use
> we can consume that in mainstream releases.
> Advantages to this are multiple:
> 1. The entry point for registering extensions with Apache Apex can be
> minimal. This way we get more indirect contributions.
> 2. Faster way to add more feature in the project.
> 3. We keep our releases cleaner.
> 4. One could progress on feature-set faster balancing both Apache Way as
> well as their own Enterprise Interests.
>
> Please share your thoughts on this.
>
> Thanks,
> Chinmay.
>


Proposal for apex/malhar extensions

2016-11-16 Thread Chinmay Kolhatkar
Dear Community,

This is in relation to malhar cleanup work that is ongoing.

In one of the talks during Apache BigData Europe, I got to know about
Spark-Packages (https://spark-packages.org/) (I believe a lot of you must be
aware of it).
A Spark package is basically functionality built over and above, and using,
Spark core functionality. A Spark package can initially live in someone's
public repository, and one can register it with
https://spark-packages.org/. Later on, as it matures and finds more use,
it gets consumed into the mainstream Spark repository and releases.

I found this idea quite interesting as a way to keep our apex-malhar releases
cleaner.

One could have an extension to apex-malhar in their own repository and just
register it with Apache Apex. As it matures and finds more and more use,
we can consume it in mainstream releases.
The advantages of this are multiple:
1. The entry point for registering extensions with Apache Apex can be
minimal. This way we get more indirect contributions.
2. It is a faster way to add more features to the project.
3. We keep our releases cleaner.
4. One could progress on a feature set faster, balancing both the Apache Way
and their own enterprise interests.

Please share your thoughts on this.

Thanks,
Chinmay.


[jira] [Resolved] (APEXMALHAR-2325) Same block id is emitting from FSInputModule

2016-11-16 Thread Priyanka Gugale (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Priyanka Gugale resolved APEXMALHAR-2325.
-
   Resolution: Fixed
Fix Version/s: 3.6.0

> Same block id is emitting from FSInputModule
> 
>
> Key: APEXMALHAR-2325
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2325
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Chaitanya
>Assignee: Chaitanya
>Priority: Minor
> Fix For: 3.6.0
>
>
> Observation: The block size is mismatched between the file splitter and the
> block reader in FSInputModule.
> Default block size in the block reader = conf.getLong("fs.local.block.size"),
> i.e., the local file system block size.
> Default block size in the file splitter = fs.getDefaultBlockSize(File Path)





[jira] [Commented] (APEXMALHAR-2325) Same block id is emitting from FSInputModule

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15670055#comment-15670055
 ] 

ASF GitHub Bot commented on APEXMALHAR-2325:


Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/479


> Same block id is emitting from FSInputModule
> 
>
> Key: APEXMALHAR-2325
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2325
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Chaitanya
>Assignee: Chaitanya
>Priority: Minor
>
> Observation: The block size is mismatched between the file splitter and the
> block reader in FSInputModule.
> Default block size in the block reader = conf.getLong("fs.local.block.size"),
> i.e., the local file system block size.
> Default block size in the file splitter = fs.getDefaultBlockSize(File Path)





[GitHub] apex-malhar pull request #479: APEXMALHAR-2325 1) Set the file system defaul...

2016-11-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/479




[jira] [Commented] (APEXMALHAR-2178) Unnecessary byte array copy in KryoSerializableStreamCodec

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669910#comment-15669910
 ] 

ASF GitHub Bot commented on APEXMALHAR-2178:


Github user ambarishpande closed the pull request at:

https://github.com/apache/apex-malhar/pull/498


> Unnecessary byte array copy in KryoSerializableStreamCodec
> --
>
> Key: APEXMALHAR-2178
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2178
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>  Labels: newbie
>
> {noformat}
>   public Slice toByteArray(T info)
>   {
>     ByteArrayOutputStream os = new ByteArrayOutputStream();
>     Output output = new Output(os);
>     kryo.writeClassAndObject(output, info);
>     output.flush();
>     return new Slice(os.toByteArray(), 0, os.toByteArray().length);
>   }
> {noformat}
> It is not necessary to call os.toByteArray() a second time just to read 
> its length, as each call makes another copy of the byte array.
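
A minimal sketch of the change the issue asks for, assuming a simplified stand-in for Slice (the real Slice class and the Kryo calls are left out so the example stays self-contained). The names SliceSketch and toSlice are hypothetical; the point is only that the array is taken from the stream once and reused for both the buffer and its length.

```java
import java.io.ByteArrayOutputStream;

class SliceSketch {
  // Hypothetical stand-in for the Slice class used in the issue.
  static final class Slice {
    final byte[] buffer;
    final int offset;
    final int length;

    Slice(byte[] buffer, int offset, int length) {
      this.buffer = buffer;
      this.offset = offset;
      this.length = length;
    }
  }

  // Copies the stream contents exactly once and reuses the array for the length.
  static Slice toSlice(ByteArrayOutputStream os) {
    byte[] bytes = os.toByteArray();
    return new Slice(bytes, 0, bytes.length);
  }
}
```

ByteArrayOutputStream.toByteArray() allocates a fresh copy on every call, so holding the result in a local variable saves one allocation and one copy per serialized object.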





[GitHub] apex-malhar pull request #498: APEXMALHAR-2178 Removed unnecessary byte arra...

2016-11-16 Thread ambarishpande
Github user ambarishpande closed the pull request at:

https://github.com/apache/apex-malhar/pull/498




Re: (APEXMALHAR-2340) Initialize the list of JdbcFieldInfo in JdbcPOJOInsertOutput operator from properties.xml

2016-11-16 Thread Hitesh Kapoor
Thank you for the inputs.
I will go ahead with JSON-based input for the mapping.
For example, the user will initialize the list of JdbcFieldInfos from the
properties.xml file as follows:


  <property>
    <name>dt.operator.JdbcOutput.fieldInfosItem[0]</name>
    <value>
    {
      "sqlType": 0,
      "columnName": "customerName",
      "pojoFieldExpression": "customerName",
      "type": "STRING"
    }
    </value>
  </property>

  <property>
    <name>dt.operator.JdbcOutput.fieldInfosItem[1]</name>
    <value>
    {
      "sqlType": 0,
      "columnName": "customerPhone",
      "pojoFieldExpression": "customerPhone",
      "type": "STRING"
    }
    </value>
  </property>


  


Regards,
Hitesh Kapoor
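
As a rough illustration of how indexed properties such as fieldInfosItem[0] and fieldInfosItem[1] could reach the operator, the sketch below assumes a hypothetical indexed setter, setFieldInfosItem(int, String), that receives each JSON value as a string. The class name, the setter, and the naive regex-based parser (flat objects only, no nesting or escaped quotes) are assumptions made for illustration; an actual implementation would more likely use a JSON library and build JdbcFieldInfo objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldInfoSketch {
  // Matches "key": "text" or "key": 123 inside a flat JSON object.
  private static final Pattern PAIR =
      Pattern.compile("\"([^\"]+)\"\\s*:\\s*(?:\"([^\"]*)\"|(-?\\d+))");

  // One map per field info; stands in for a list of JdbcFieldInfo objects.
  private final List<Map<String, String>> fieldInfos = new ArrayList<>();

  // Naive parser: collects key/value pairs, keeping every value as a string.
  static Map<String, String> parseFlatJson(String json) {
    Map<String, String> out = new LinkedHashMap<>();
    Matcher m = PAIR.matcher(json);
    while (m.find()) {
      out.put(m.group(1), m.group(2) != null ? m.group(2) : m.group(3));
    }
    return out;
  }

  // Hypothetical indexed setter: grows the list as needed, then stores the entry.
  public void setFieldInfosItem(int index, String value) {
    while (fieldInfos.size() <= index) {
      fieldInfos.add(null);
    }
    fieldInfos.set(index, parseFlatJson(value));
  }

  public Map<String, String> getFieldInfo(int index) {
    return fieldInfos.get(index);
  }
}
```

Growing the list inside the setter means the entries can arrive in any order, which matters if the properties are not applied in index order.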


On Tue, Nov 15, 2016 at 11:41 PM, Sanjay Pujare wrote:

> +1 for standardized JSON based mapping/schema definitions
>
> On Mon, Nov 14, 2016 at 11:15 PM, Priyanka Gugale wrote:
>
> > +1 for having json based input for mappings.
> >
> > -Priyanka
> >
> > On Mon, Nov 14, 2016 at 11:21 PM, Devendra Tagare <devend...@datatorrent.com> wrote:
> >
> > > Hi,
> > >
> > > > CSV schema formats are based on delimited schemas, which are meant to
> > > > be sequence sensitive (ref: DelimitedSchema,
> > > > contrib/src/main/java/com/datatorrent/contrib/parser/DelimitedSchema.java)
> > > > and don't have a notion of input-to-output field mappings.
> > > >
> > > > Field info mappings for output operators are typically of the form
> > > > destFieldName:pojoFieldName:type/supportType and are not intended to
> > > > be sequence sensitive.
> > > >
> > > > We can go with a JSON-based structure which maps sources to
> > > > destinations with their respective types:
> > >
> > > {
> > >   "destinationFieldName": "destination field name",
> > >   "destType" : "support type, type",
> > >   "srcFieldName" : "source pojo field name",
> > >   "srcType" : "support type, type",
> > >   "constraints" : "constraint expression"
> > > }
> > >
> > > Thanks,
> > > Dev
> > >
> > >
> > >
> > >
> > >
> > > Thanks,
> > > Dev
> > >
> > > On Mon, Nov 14, 2016 at 9:27 AM, Ashwin Chandra Putta <
> > > ashwinchand...@gmail.com> wrote:
> > >
> > > > Hitesh,
> > > >
> > > > > We should standardize the schema definition across apex for
> > > > > individual operators and tuple classes.
> > > > >
> > > > > I think you should be able to use schema definition for CSV parser
> > > > > without the delimiter.
> > > > Regards,
> > > > Ashwin.
> > > >
> > > > > On Nov 14, 2016 2:45 AM, "Hitesh Kapoor" wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > > Currently in the JdbcPOJOInsertOutput operator we cannot configure
> > > > > > JdbcFieldInfo via properties.xml, and the user has to do the
> > > > > > necessary coding in his application.
> > > > > >
> > > > > > To implement this improvement, the approach mentioned in
> > > > > > http://docs.datatorrent.com/application_packages/#operator-properties
> > > > > > could be followed.
> > > > > > Now we need to provide the user a format for specifying the value
> > > > > > of fieldInfo.
> > > > > >
> > > > > > Kindly let me know which of the following is the best format to
> > > > > > use for this:
> > > > > > 1) CSV string (or any delimited string) with values for the data
> > > > > > members of JdbcFieldInfo in a fixed sequence.
> > > > > > 2) JSON format with appropriate mapping.
> > > > > > 3) XML format with appropriate name tags and values.
> > > > >
> > > > > Regards,
> > > > > Hitesh
> > > > >
> > > >
> > >
> >
>