Re: JSON License and Apache Projects

2016-11-24 Thread Chinmay Kolhatkar
Yes, that's the mail. There are a couple of related conversations that can be
seen here too:
https://lists.apache.org/list.html?legal-disc...@apache.org

I suggest we take a look at it and do the needful from our end too.

-Chinmay.


On Fri, Nov 25, 2016 at 10:15 AM, Amol Kekre  wrote:

> Chinmay,
> Is this the thread you were looking for?
>
> Thks
> Amol
>
> -- Forwarded message --
> From: Ted Dunning 
> Date: Thu, Nov 24, 2016 at 2:28 PM
> Subject: Re: JSON License and Apache Projects
> To: "gene...@incubator.apache.org" 
>
>
> Stephan,
>
> What you suggest should work (if you add another dependency to provide the
> needed classes).
>
> You have to be careful, however, because your consumers may expect to get
> the full json.org API.
>
> I would suggest that exclusions like this should only be used while your
> direct dependency still has the dependency on json.org. When they fix it,
> you can drop the exclusion and all will be good.
>
>
>
> On Thu, Nov 24, 2016 at 2:21 AM, Stephan Ewen  wrote:
>
> > Just to be on the safe side:
> >
> > If project X depends on another project Y that uses json.org (and thus
> > project X has json.org as a transitive dependency) is it sufficient to
> > exclude the transitive json.org dependency in the reference to project Y?
> >
> > Something like that:
> >
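> > <dependency>
> >   <groupId>org.apache.hive.hcatalog</groupId>
> >   <artifactId>hcatalog-core</artifactId>
> >   <version>0.12.0</version>
> >   <exclusions>
> >     <exclusion>
> >       <groupId>org.json</groupId>
> >       <artifactId>json</artifactId>
> >     </exclusion>
> >   </exclusions>
> > </dependency>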
> >
> > Thanks,
> > Stephan
> >
> >
> > On Thu, Nov 24, 2016 at 10:00 AM, Jochen Theodorou 
> > wrote:
> >
> > > is that library able to deal with the jdk9 module system?
> > >
> > >
> > > On 24.11.2016 02:16, James Bognar wrote:
> > >
> > >> Shameless plug for Apache Juneau that has a cleanroom implementation of a
> > >> JSON serializer and parser in context of a common serialization API that
> > >> includes a variety of serialization languages for POJOs.
> > >>
> > >> On Wed, Nov 23, 2016 at 8:10 PM Ted Dunning 
> > >> wrote:
> > >>
> > >> The VP Legal for Apache has determined that the JSON processing library
> > >>> from json.org is not usable as a dependency by Apache projects. This is
> > >>> because the license includes a line that places a field of use condition
> > >>> on downstream users in a way that is not compatible with Apache's
> > >>> license.
> > >>>
> > >>> This decision is, unfortunately, a change from the previous situation.
> > >>> While the current decision is correct, it would have been nice if we had
> > >>> had this decision originally.
> > >>>
> > >>> As such, some existing projects may be impacted because they assumed that
> > >>> the json.org dependency was OK to use.
> > >>>
> > >>> Incubator projects that are currently using the json.org library have
> > >>> several courses of action:
> > >>>
> > >>> 1) just drop it. Some projects like Storm have demos that use twitter4j
> > >>> which incorporates the problematic code. These demos aren't core and
> > >>> could just be dropped for a time.
> > >>>
> > >>> 2) help dependencies move away from problem code. I have sent a pull
> > >>> request to twitter4j, for example, that eliminates the problem. If they
> > >>> accept the pull, then all would be good for the projects that use
> > >>> twitter4j (and thus json.org).
> > >>>
> > >>> 3) replace the json.org artifact with a compatible one that is open
> > >>> source. I have created and published an artifact based on clean-room
> > >>> Android code that replicates the most important parts of the json.org
> > >>> code. This code is compatible, but lacks some coverage. It also could
> > >>> lead to jar hell if used injudiciously because it uses the org.json
> > >>> package. Shading and exclusion in a pom might help. Or not. Go with
> > >>> caution here.
> > >>>
> > >>> 4) switch to safer alternatives such as Jackson. This requires code
> > >>> changes, but is probably a good thing to do. This option is the one that
> > >>> is best in the long-term but is also the most expensive.
> > >>>
> > >>>
> > >>> -- Forwarded message --
> > >>> From: Jim Jagielski 
> > >>> Date: Wed, Nov 23, 2016 at 6:10 AM
> > >>> Subject: JSON License and Apache Projects
> > >>> To: ASF Board 
> > >>>
> > >>>
> > >>> (forwarded from legal-discuss@)
> > >>>
> > >>> As some of you may know, recently the JSON License has been
> > >>> moved to Category X (https://www.apache.org/legal/resolved#category-x).
> > >>>
> > >>> I understand that this has impacted some projects, especially
> > >>> those in the midst of doing a release. I also understand that
> > >>> up 

[jira] [Closed] (APEXCORE-566) Compute bytes transferred at the logical operator level

2016-11-24 Thread Francis Fernandes (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Fernandes closed APEXCORE-566.
--
Resolution: Won't Fix

> Compute bytes transferred at the logical operator level
> ---
>
> Key: APEXCORE-566
> URL: https://issues.apache.org/jira/browse/APEXCORE-566
> Project: Apache Apex Core
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: Francis Fernandes
>Assignee: Francis Fernandes
>
> Need to compute _totalBufferServerWriteBytesPSMA_ and 
> _totalBufferServerReadBytesPSMA_ at the logical operator level.
> Currently this is done at the application level. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-core pull request #414: APEXCORE-566 Computing bufferServerBytes in the...

2016-11-24 Thread francisf
Github user francisf closed the pull request at:

https://github.com/apache/apex-core/pull/414


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXCORE-566) Compute bytes transferred at the logical operator level

2016-11-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694932#comment-15694932
 ] 

ASF GitHub Bot commented on APEXCORE-566:
-

Github user francisf closed the pull request at:

https://github.com/apache/apex-core/pull/414


> Compute bytes transferred at the logical operator level
> ---
>
> Key: APEXCORE-566
> URL: https://issues.apache.org/jira/browse/APEXCORE-566
> Project: Apache Apex Core
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: Francis Fernandes
>Assignee: Francis Fernandes
>
> Need to compute _totalBufferServerWriteBytesPSMA_ and 
> _totalBufferServerReadBytesPSMA_ at the logical operator level.
> Currently this is done at the application level. 





Fwd: JSON License and Apache Projects

2016-11-24 Thread Amol Kekre
Chinmay,
Is this the thread you were looking for?

Thks
Amol

-- Forwarded message --
From: Ted Dunning 
Date: Thu, Nov 24, 2016 at 2:28 PM
Subject: Re: JSON License and Apache Projects
To: "gene...@incubator.apache.org" 


Stephan,

What you suggest should work (if you add another dependency to provide the
needed classes).

You have to be careful, however, because your consumers may expect to get
the full json.org API.

I would suggest that exclusions like this should only be used while your
direct dependency still has the dependency on json.org. When they fix it,
you can drop the exclusion and all will be good.



On Thu, Nov 24, 2016 at 2:21 AM, Stephan Ewen  wrote:

> Just to be on the safe side:
>
> If project X depends on another project Y that uses json.org (and thus
> project X has json.org as a transitive dependency) is it sufficient to
> exclude the transitive json.org dependency in the reference to project Y?
>
> Something like that:
>
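> <dependency>
>   <groupId>org.apache.hive.hcatalog</groupId>
>   <artifactId>hcatalog-core</artifactId>
>   <version>0.12.0</version>
>   <exclusions>
>     <exclusion>
>       <groupId>org.json</groupId>
>       <artifactId>json</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>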
>
> Thanks,
> Stephan
>
>
> On Thu, Nov 24, 2016 at 10:00 AM, Jochen Theodorou 
> wrote:
>
> > is that library able to deal with the jdk9 module system?
> >
> >
> > On 24.11.2016 02:16, James Bognar wrote:
> >
> >> Shameless plug for Apache Juneau that has a cleanroom implementation of a
> >> JSON serializer and parser in context of a common serialization API that
> >> includes a variety of serialization languages for POJOs.
> >>
> >> On Wed, Nov 23, 2016 at 8:10 PM Ted Dunning 
> >> wrote:
> >>
> >> The VP Legal for Apache has determined that the JSON processing library
> >>> from json.org is not usable as a dependency by Apache projects. This is
> >>> because the license includes a line that places a field of use condition
> >>> on downstream users in a way that is not compatible with Apache's license.
> >>>
> >>> This decision is, unfortunately, a change from the previous situation.
> >>> While the current decision is correct, it would have been nice if we had
> >>> had this decision originally.
> >>>
> >>> As such, some existing projects may be impacted because they assumed that
> >>> the json.org dependency was OK to use.
> >>>
> >>> Incubator projects that are currently using the json.org library have
> >>> several courses of action:
> >>>
> >>> 1) just drop it. Some projects like Storm have demos that use twitter4j
> >>> which incorporates the problematic code. These demos aren't core and
> >>> could just be dropped for a time.
> >>>
> >>> 2) help dependencies move away from problem code. I have sent a pull
> >>> request to twitter4j, for example, that eliminates the problem. If they
> >>> accept the pull, then all would be good for the projects that use
> >>> twitter4j (and thus json.org).
> >>>
> >>> 3) replace the json.org artifact with a compatible one that is open
> >>> source. I have created and published an artifact based on clean-room
> >>> Android code that replicates the most important parts of the json.org
> >>> code. This code is compatible, but lacks some coverage. It also could
> >>> lead to jar hell if used injudiciously because it uses the org.json
> >>> package. Shading and exclusion in a pom might help. Or not. Go with
> >>> caution here.
> >>>
> >>> 4) switch to safer alternatives such as Jackson. This requires code
> >>> changes, but is probably a good thing to do. This option is the one that
> >>> is best in the long-term but is also the most expensive.
> >>>
> >>>
> >>> -- Forwarded message --
> >>> From: Jim Jagielski 
> >>> Date: Wed, Nov 23, 2016 at 6:10 AM
> >>> Subject: JSON License and Apache Projects
> >>> To: ASF Board 
> >>>
> >>>
> >>> (forwarded from legal-discuss@)
> >>>
> >>> As some of you may know, recently the JSON License has been
> >>> moved to Category X (https://www.apache.org/legal/resolved#category-x).
> >>>
> >>> I understand that this has impacted some projects, especially
> >>> those in the midst of doing a release. I also understand that
> >>> up until now, really, there has been no real "outcry" over our
> >>> usage of it, especially from end-users and other consumers of
> >>> our projects which use it.
> >>>
> >>> As compelling as that is, the fact is that the JSON license
> >>> itself is not OSI approved and is therefore not, by definition,
> >>> an "Open Source license" and, as such, cannot be considered as
> >>> one which is acceptable as related to categories.
> >>>
> >>> Therefore, w/ my VP Legal hat on, I am making the following
> >>> statements:
> >>>
> >>>  o No new project, sub-project or codebase, which has not
> >>>used 

[jira] [Assigned] (APEXCORE-294) Graceful application shutdown

2016-11-24 Thread Tushar Gosavi (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tushar Gosavi reassigned APEXCORE-294:
--

Assignee: Tushar Gosavi

> Graceful application shutdown
> -
>
> Key: APEXCORE-294
> URL: https://issues.apache.org/jira/browse/APEXCORE-294
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Thomas Weise
>Assignee: Tushar Gosavi
>
> By injecting the end stream tuple into input operators, to replace the 
> current mechanism of forced operator undeploy.





[jira] [Commented] (APEXCORE-405) Provide an API to launch DAG on the cluster

2016-11-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694636#comment-15694636
 ] 

ASF GitHub Bot commented on APEXCORE-405:
-

GitHub user tweise opened a pull request:

https://github.com/apache/apex-core/pull/423

APEXCORE-405 Allow client to shutdown and check finished through app handle.

Based on my findings when trying to use this in Beam.

@PramodSSImmaneni @vrozov please merge, it is blocking the release. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tweise/apex-core APEXCORE-405

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/423.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #423


commit 0be03527e361b023e79c0d9f9e8ef5e2632d64bf
Author: Thomas Weise 
Date:   2016-11-25T02:22:26Z

APEXCORE-405 Allow client to query if application was finished.




> Provide an API to launch DAG on the cluster
> ---
>
> Key: APEXCORE-405
> URL: https://issues.apache.org/jira/browse/APEXCORE-405
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Pramod Immaneni
>Assignee: Pramod Immaneni
> Fix For: 3.5.0
>
>
> Today an API exists to launch a DAG in local mode, but such an API is not 
> available to launch the app on the cluster; only a CLI tool is available. 
> Provide an API to be able to do this. 





[GitHub] apex-core pull request #423: APEXCORE-405 Allow client to shutdown and check...

2016-11-24 Thread tweise
GitHub user tweise opened a pull request:

https://github.com/apache/apex-core/pull/423

APEXCORE-405 Allow client to shutdown and check finished through app handle.

Based on my findings when trying to use this in Beam.

@PramodSSImmaneni @vrozov please merge, it is blocking the release. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tweise/apex-core APEXCORE-405

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/423.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #423


commit 0be03527e361b023e79c0d9f9e8ef5e2632d64bf
Author: Thomas Weise 
Date:   2016-11-25T02:22:26Z

APEXCORE-405 Allow client to query if application was finished.






[jira] [Closed] (APEXCORE-202) Integration with Samoa

2016-11-24 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise closed APEXCORE-202.
-
Resolution: Fixed

> Integration with Samoa
> --
>
> Key: APEXCORE-202
> URL: https://issues.apache.org/jira/browse/APEXCORE-202
> Project: Apache Apex Core
>  Issue Type: New Feature
>Reporter: Siyuan Hua
>Assignee: Bhupesh Chawda
>  Labels: roadmap
>
> Apache Samoa [https://samoa.incubator.apache.org/] is an abstraction over a 
> collection of streaming machine learning algorithms. So far, it has 
> integrations with Samza, Storm and Flink. It is a good starting point for 
> Apex to support streaming ML.





Re: Adding new log4j appender to Apex core

2016-11-24 Thread Vlad Rozov

David,

Yes, I understand that the functionality will only be available in case 
of the predefined log file appender. IMO, such an assumption is too 
restrictive, and it does not look like the default configuration will 
cover the majority of use cases. Also, the JIRA talks about providing an 
offset in the log file. Does this mean that events can only be consumed by 
a tool? For a human, a line number would be more useful.


Thank you,

Vlad

On 11/23/16 22:27, David Yan wrote:

Vlad,

The feature only works *if the log file name at error does not change later*.

In this case, Priyanka proposes a default appender that the user can use
that has this behavior while log rotation is still supported.

If the user has a custom log appender, the feature can still work if it
satisfies the above requirement. Otherwise, there is no way for this
feature to work.

We can support a configuration option to force the inclusion of log
location and offset in the error STRAM event even if the Apex appender is
not used, if the user knows that they are using a custom appender that
satisfies the requirement.

David

On Wed, Nov 23, 2016 at 8:47 PM, Vlad Rozov  wrote:


Additionally, I think that it is necessary to re-evaluate the
requirements. Custom logging is quite common and many enterprises/Devops
have their own preferences/policies for log rotation and logging format. I
saw instances when logging was redirected to stdout. By enforcing a specific
rotation policy or log format, the feature is more likely not to be used.

Thank you,

Vlad


On 11/23/16 14:21, Vlad Rozov wrote:


Both approaches look quite "hacky" to me.

Thank you,

Vlad

On 11/23/16 00:01, Mohit Jotwani wrote:


+1 - Approach 2

Regards,
Mohit

On Wed, Nov 23, 2016 at 12:35 PM, AJAY GUPTA 
wrote:

+1 for approach 2.


Regards,
Ajay

On Wed, Nov 23, 2016 at 12:16 PM, David Yan 
wrote:

The goal of this log4j appender is to provide a log offset and the fixed
name of the container log file (instead of apex.log becoming apex.log.1 and
then apex.log.2, etc. due to rotation) as part of an error STRAM event so
users can easily locate the log entries around the error.

The user can override the appender, but in that case, the engine detects
that and will not include the log location as part of the STRAM event.

David

On Tue, Nov 22, 2016 at 7:10 PM, Priyanka Gugale <priya...@datatorrent.com>
wrote:

Hi,

Thomas,
Yes, log4j is ultimately owned by the user, and they should be able to
override it. What I am trying to do is provide a default behavior for Apex.
In case the user isn't using any logger of their own, we should use this new
appender of Apex rather than using the standard log4j appender as per hadoop
config.

Sanjay,
Archetype is a good place to put this and I will add it there, but many
times people won't use it. So I wanted to keep it at ~/.dt as well. Is there
any other default config folder for Apex?

Also I am not relying on anything. If we fail to find config in the app jar
or ~/.dt we are going to skip usage of this new appender.

-Priyanka

On Wed, Nov 23, 2016 at 5:58 AM, Sanjay Pujare <san...@datatorrent.com>
wrote:

The only way to “enforce” this new appender is to update the archetypes
(apex-app-archetype and apex-conf-archetype under apex-core/) to use the
new ones as default. But there does not seem to be a way to enforce this
for anyone not using the archetypes.

I agree with not relying on ~/.dt in apex-core.

On 11/22/16, 1:08 PM, "Thomas Weise" wrote:

  The log4j configuration is ultimately owned by the user, so how do you want
  to enforce a custom appender?

  I don't think that this should rely on anything in ~/.dt either

  Thomas

  On Tue, Nov 22, 2016 at 10:00 AM, Priyanka Gugale <priya...@datatorrent.com>
  wrote:

  > Hi,
  >
  > I am working on APEXCORE-563
  > As per this Jira we should put log file name in container/operator events.
  > The problem is current RollingFileAppender keeps renaming files from 1 to 2
  > to ... n as files reach maximum allowed file size. Because of constant
  > renaming of files we can't put a fixed file name in stram event.
  >
  > To overcome this I would like to add a new log4j appender to ApexCore.
  > There are two ways I can implement this:
  > 1. Have Daily rolling file appender. The current file will be recognized
  > based on timestamp in file name. Also to control max file size, we need to
  > keep rolling files based on size as well.
  > 2. Have Rolling File Appender but do not rename files. When max file size
  > is reached create new file with next number e.g. create log file dt.log.2
  > after dt.log.1 is full. Also to recognize the latest file keep the softlink
  > named dt.log pointing to current log file.
  >
  > I would 
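
Approach 2 from the proposal above (never rename a log file, and keep a
soft link pointing at the newest one) can be sketched roughly as follows.
This is only an illustration under stated assumptions: the class
NonRenamingRoller and its methods are hypothetical names, it uses plain
java.nio rather than the log4j appender API, and it is not the appender
proposed for Apex.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of approach 2; not the real Apex or log4j appender. */
final class NonRenamingRoller {
    private final Path dir;
    private final String baseName; // e.g. "dt.log"
    private final long maxBytes;   // roll when the current file would exceed this
    private int index = 1;         // suffix of the file currently being written
    private long written = 0;      // bytes written to the current file so far

    NonRenamingRoller(Path dir, String baseName, long maxBytes) throws IOException {
        this.dir = dir;
        this.baseName = baseName;
        this.maxBytes = maxBytes;
        pointLinkAtCurrent();
    }

    /** File currently being written; its name never changes once it is full. */
    Path currentFile() {
        return dir.resolve(baseName + "." + index);
    }

    /** Appends one line and returns the byte offset where it starts in currentFile(). */
    long append(String line) throws IOException {
        byte[] bytes = (line + "\n").getBytes(StandardCharsets.UTF_8);
        if (written > 0 && written + bytes.length > maxBytes) {
            // Start the next numbered file instead of renaming the old one.
            index++;
            written = 0;
            pointLinkAtCurrent();
        }
        long offset = written;
        Files.write(currentFile(), bytes,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        written += bytes.length;
        return offset;
    }

    /** Re-creates the soft link baseName so that it points at the current file. */
    private void pointLinkAtCurrent() throws IOException {
        Path link = dir.resolve(baseName);
        Files.deleteIfExists(link);
        Files.createSymbolicLink(link, currentFile().getFileName());
    }
}
```

Because a file such as dt.log.1 keeps its name after it is full, the pair
(currentFile(), offset) returned at write time stays valid later, which is
the property the error event needs. Creating symbolic links may require
extra privileges on some platforms.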

[jira] [Commented] (APEXMALHAR-2355) Filter POJO operator is missing schema annotation on the output ports

2016-11-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693249#comment-15693249
 ] 

ASF GitHub Bot commented on APEXMALHAR-2355:


GitHub user PramodSSImmaneni opened a pull request:

https://github.com/apache/apex-malhar/pull/509

APEXMALHAR-2355 Added schema annotation

@yogidevendra @pradeepdalvi please see

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PramodSSImmaneni/apex-malhar APEXMALHAR-2355

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/509.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #509


commit b2fa7b00a1cfd457bf3bdb835f0c30b2bb893a98
Author: Pramod Immaneni 
Date:   2016-11-24T13:03:39Z

APEXMALHAR-2355 Added schema annotation




> Filter POJO operator is missing schema annotation on the output ports
> -
>
> Key: APEXMALHAR-2355
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2355
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Pramod Immaneni
>Assignee: Pramod Immaneni
>






[GitHub] apex-malhar pull request #509: APEXMALHAR-2355 Added schema annotation

2016-11-24 Thread PramodSSImmaneni
GitHub user PramodSSImmaneni opened a pull request:

https://github.com/apache/apex-malhar/pull/509

APEXMALHAR-2355 Added schema annotation

@yogidevendra @pradeepdalvi please see

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PramodSSImmaneni/apex-malhar APEXMALHAR-2355

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/509.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #509


commit b2fa7b00a1cfd457bf3bdb835f0c30b2bb893a98
Author: Pramod Immaneni 
Date:   2016-11-24T13:03:39Z

APEXMALHAR-2355 Added schema annotation






[jira] [Created] (APEXMALHAR-2355) Filter POJO operator is missing schema annotation on the output ports

2016-11-24 Thread Pramod Immaneni (JIRA)
Pramod Immaneni created APEXMALHAR-2355:
---

 Summary: Filter POJO operator is missing schema annotation on the 
output ports
 Key: APEXMALHAR-2355
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2355
 Project: Apache Apex Malhar
  Issue Type: Bug
Reporter: Pramod Immaneni
Assignee: Pramod Immaneni








[jira] [Resolved] (APEXMALHAR-2340) Initialize the list of JdbcFieldInfo in JdbcPOJOInsertOutput from properties.xml

2016-11-24 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2340.

   Resolution: Fixed
Fix Version/s: 3.6.0

> Initialize the list of JdbcFieldInfo in JdbcPOJOInsertOutput from 
> properties.xml
> 
>
> Key: APEXMALHAR-2340
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2340
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> Currently the list of JdbcFieldInfo is populated using java code.
> This should be done using properties.xml file.





[jira] [Commented] (APEXMALHAR-2340) Initialize the list of JdbcFieldInfo in JdbcPOJOInsertOutput from properties.xml

2016-11-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693106#comment-15693106
 ] 

ASF GitHub Bot commented on APEXMALHAR-2340:


Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/507


> Initialize the list of JdbcFieldInfo in JdbcPOJOInsertOutput from 
> properties.xml
> 
>
> Key: APEXMALHAR-2340
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2340
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
>
> Currently the list of JdbcFieldInfo is populated using java code.
> This should be done using properties.xml file.





[GitHub] apex-malhar pull request #507: APEXMALHAR-2340 code changes to initialize th...

2016-11-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/507




[jira] [Commented] (APEXCORE-575) Improve application relaunch time.

2016-11-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693084#comment-15693084
 ] 

ASF GitHub Bot commented on APEXCORE-575:
-

GitHub user tushargosavi opened a pull request:

https://github.com/apache/apex-core/pull/422

APEXCORE-575 Improve application restart time.

I have implemented a new storage agent (CascadeStorageAgent) which 
maintains two storage agents, one for the old checkpoint directory and one 
for the new checkpoint directory. Using this storage agent, we can read 
directly from the old directory during initial start and write to the new 
checkpoint directory. With this we avoid copying checkpoints from the old 
directory to the new directory. With 2 GB of state, application restart was 
brought down from 2 minutes to a few seconds.

Other changes are
- Add log message to print time taken to copy the initial state.
- Do not copy stats and events directory as they are overwritten anyway in 
the new application.
- Use new storage agent to avoid copying checkpoints directory.
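
The read-from-old, write-to-new idea described above can be sketched as
follows. This is a minimal illustration, not the code in the pull request:
CheckpointStore, InMemoryStore and CascadingStore are hypothetical names,
and the interface is a simplified stand-in rather than the Apex StorageAgent
API. The fallback order (new store first, then old) is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified checkpoint store; not the Apex StorageAgent API. */
interface CheckpointStore {
    void save(int operatorId, long windowId, byte[] state);

    /** Returns the saved state, or null when no checkpoint exists. */
    byte[] load(int operatorId, long windowId);
}

/** In-memory store standing in for one checkpoint directory. */
final class InMemoryStore implements CheckpointStore {
    private final Map<String, byte[]> data = new HashMap<>();

    @Override
    public void save(int operatorId, long windowId, byte[] state) {
        data.put(key(operatorId, windowId), state.clone());
    }

    @Override
    public byte[] load(int operatorId, long windowId) {
        byte[] state = data.get(key(operatorId, windowId));
        return state == null ? null : state.clone();
    }

    private static String key(int operatorId, long windowId) {
        return operatorId + "/" + windowId;
    }
}

/** Writes only to the new store; reads fall back to the old store. */
final class CascadingStore implements CheckpointStore {
    private final CheckpointStore oldStore;
    private final CheckpointStore newStore;

    CascadingStore(CheckpointStore oldStore, CheckpointStore newStore) {
        this.oldStore = oldStore;
        this.newStore = newStore;
    }

    @Override
    public void save(int operatorId, long windowId, byte[] state) {
        // New checkpoints never touch the old directory.
        newStore.save(operatorId, windowId, state);
    }

    @Override
    public byte[] load(int operatorId, long windowId) {
        // Prefer the new directory; fall back to the old one on a miss.
        byte[] state = newStore.load(operatorId, windowId);
        return state != null ? state : oldStore.load(operatorId, windowId);
    }
}
```

With this arrangement nothing has to be copied at launch: checkpoints from
the previous run are read in place, and the new directory fills up only as
the relaunched application writes its own checkpoints.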


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tushargosavi/apex-core restart_optimizations

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/422.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #422


commit edc02612efbb786243e6a0188d40fe1e08ac941b
Author: Tushar R. Gosavi 
Date:   2016-11-18T18:06:15Z

APEXCORE-575 Improve application restart time.




> Improve application relaunch time.
> --
>
> Key: APEXCORE-575
> URL: https://issues.apache.org/jira/browse/APEXCORE-575
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Tushar Gosavi
>Assignee: Tushar Gosavi
>
> Improve application relaunch time.





[GitHub] apex-core pull request #422: APEXCORE-575 Improve application restart time.

2016-11-24 Thread tushargosavi
GitHub user tushargosavi opened a pull request:

https://github.com/apache/apex-core/pull/422

APEXCORE-575 Improve application restart time.

I have implemented a new storage agent (CascadeStorageAgent) which 
maintains two storage agents, one for the old checkpoint directory and one 
for the new checkpoint directory. Using this storage agent, we can read 
directly from the old directory during initial start and write to the new 
checkpoint directory. With this we avoid copying checkpoints from the old 
directory to the new directory. With 2 GB of state, application restart was 
brought down from 2 minutes to a few seconds.

Other changes are
- Add log message to print time taken to copy the initial state.
- Do not copy stats and events directory as they are overwritten anyway in 
the new application.
- Use new storage agent to avoid copying checkpoints directory.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tushargosavi/apex-core restart_optimizations

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/422.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #422


commit edc02612efbb786243e6a0188d40fe1e08ac941b
Author: Tushar R. Gosavi 
Date:   2016-11-18T18:06:15Z

APEXCORE-575 Improve application restart time.






[jira] [Commented] (APEXCORE-575) Improve application relaunch time.

2016-11-24 Thread Tushar Gosavi (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693077#comment-15693077
 ] 

Tushar Gosavi commented on APEXCORE-575:


I have implemented a new storage agent which maintains two storage agents, 
one for the old checkpoint directory and one for the new checkpoint 
directory. Using this storage agent, we can read directly from the old 
directory during initial start and write to the new checkpoint directory. 
With this we avoid copying checkpoints from the old directory to the new 
directory. With 2 GB of state, application restart was brought down from 
2 minutes to a few seconds.

Other changes are 
- Do not copy stats and events directory as they are overwritten anyway in the 
new application.
- Use new storage agent to avoid copying checkpoints directory.

{code}
16/11/24 03:51:48 INFO stram.FSRecoveryHandler: Creating 
hdfs://node18.morado.com:8020/user/tushar/datatorrent/apps/application_1479889815831_0086/recovery/log
16/11/24 03:51:48 INFO stram.StramClient: Copying initial state took 1191 ms
16/11/24 03:51:48 INFO stram.StramClient: Set the environment for the 
application master
16/11/24 03:51:48 INFO stram.StramClient: Setting up app master command
{code}

The old application state for app running for 10 minutes.
{code}
2.0 G    5.9 G    datatorrent/apps/application_1479889815831_0081/checkpoints
70.6 K   211.7 K  datatorrent/apps/application_1479889815831_0081/events
133.8 M  401.4 M  datatorrent/apps/application_1479889815831_0081/stats
{code}

> Improve application relaunch time.
> --
>
> Key: APEXCORE-575
> URL: https://issues.apache.org/jira/browse/APEXCORE-575
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Tushar Gosavi
>Assignee: Tushar Gosavi
>
> Improve application relaunch time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: APEXMALHAR-2354 Heuristic Watermark in windowed operator

2016-11-24 Thread Bhupesh Chawda
Good idea, +1

~ Bhupesh

On Thu, Nov 24, 2016 at 5:08 PM, Tushar Gosavi 
wrote:

> +1, but can you change name of HeuristicWatermark to
> WatermarkGenerator as the purpose of
> this interface is to generate watermarks.
>
> - Tushar.
>
>
> On Thu, Nov 24, 2016 at 4:22 PM, Chinmay Kolhatkar 
> wrote:
> > Dear Community,
> >
> > I'm working on adding support for heuristic watermark in Windowed Operator.
> > Heuristic watermark give users of WindowedOperator a way to logically
> > determine whether watermark condition is met or not by inspecting the
> > tuples received.
> > This can act as a replacement for or way to work along with Control Tuple
> > received on control port.
> >
> > Here is the approach I'm considering:
> >
> > 1. A new interface lets say "HeuristicWatermark" will be added which
> > extends Component
> > The reason why its extended with Component is then it can follow a
> > lifecycle.
> >
> > 2. This method contains a single method something like this:
> >
> > ControlTuple.Watermark processTupleForWatermark(Tuple.WindowedTuple input);
> >
> > 3. Object of this type can optionally be set to AbstractWindowedOperator as
> > a plugin which identified whether watermark condition has reached.
> >
> > 4. If heuristicWatermark is set, processTupleForWatermark will be called
> > for every received tuple and the method can return the Watermark object if
> > watermark condition is met OR return null if not so.
> >
> > 5. If return value of this method is non-null, then processWatermark method
> > will be called which sets the nextWatermark value. And then rest of the
> > watermark processing can continue to happen in endWindow.
> >
> >
> > Please share your opinion on above approach.
> >
> > Thanks,
> > Chinmay.
>


Re: APEXMALHAR-2354 Heuristic Watermark in windowed operator

2016-11-24 Thread Tushar Gosavi
+1, but could you rename HeuristicWatermark to WatermarkGenerator, since
the purpose of this interface is to generate watermarks?

- Tushar.


On Thu, Nov 24, 2016 at 4:22 PM, Chinmay Kolhatkar  wrote:
> Dear Community,
>
> I'm working on adding support for heuristic watermark in Windowed Operator.
> Heuristic watermark give users of WindowedOperator a way to logically
> determine whether watermark condition is met or not by inspecting the
> tuples received.
> This can act as a replacement for or way to work along with Control Tuple
> received on control port.
>
> Here is the approach I'm considering:
>
> 1. A new interface lets say "HeuristicWatermark" will be added which
> extends Component
> The reason why its extended with Component is then it can follow a
> lifecycle.
>
> 2. This method contains a single method something like this:
>
> ControlTuple.Watermark processTupleForWatermark(Tuple.WindowedTuple
> input);
>
> 3. Object of this type can optionally be set to AbstractWindowedOperator as
> a plugin which identified whether watermark condition has reached.
>
> 4. If heuristicWatermark is set, processTupleForWatermark will be called
> for every received tuple and the method can return the Watermark object if
> watermark condition is met OR return null if not so.
>
> 5. If return value of this method is non-null, then processWatermark method
> will be called which sets the nextWatermark value. And then rest of the
> watermark processing can continue to happen in endWindow.
>
>
> Please share your opinion on above approach.
>
> Thanks,
> Chinmay.


[jira] [Commented] (APEXMALHAR-2354) Add support for heuristic watermarks in WindowedOperator

2016-11-24 Thread Chinmay Kolhatkar (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15692958#comment-15692958
 ] 

Chinmay Kolhatkar commented on APEXMALHAR-2354:
---

Discussion is happening here:
https://lists.apache.org/thread.html/a34fb3419c43a1ed09a47d303155d8f0dc347af191e860695a48c71e@%3Cdev.apex.apache.org%3E


> Add support for heuristic watermarks in WindowedOperator
> 
>
> Key: APEXMALHAR-2354
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2354
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Chinmay Kolhatkar
>Assignee: Chinmay Kolhatkar
>
> The purpose of this improvement is to add support for plugging in heuristic 
> algorithms for determining watermarks in WindowedOperator.





APEXMALHAR-2354 Heuristic Watermark in windowed operator

2016-11-24 Thread Chinmay Kolhatkar
Dear Community,

I'm working on adding support for heuristic watermarks in WindowedOperator.
A heuristic watermark gives users of WindowedOperator a way to determine
logically, by inspecting the tuples received, whether the watermark condition
has been met.
This can act as a replacement for, or work alongside, the control tuples
received on the control port.

Here is the approach I'm considering:

1. A new interface, let's say "HeuristicWatermark", will be added which
extends Component.
It extends Component so that it can follow a lifecycle.

2. This interface contains a single method, something like this:

ControlTuple.Watermark processTupleForWatermark(Tuple.WindowedTuple input);

3. An object of this type can optionally be set on AbstractWindowedOperator as
a plugin that identifies whether the watermark condition has been reached.

4. If heuristicWatermark is set, processTupleForWatermark will be called
for every received tuple; the method returns a Watermark object if the
watermark condition is met, or null otherwise.

5. If the return value of this method is non-null, the processWatermark method
will be called, which sets the nextWatermark value. The rest of the
watermark processing then continues to happen in endWindow.
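
The five steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: TimedTuple, Watermark and WindowedOperatorSketch are hypothetical stand-ins for Tuple.WindowedTuple, ControlTuple.Watermark and AbstractWindowedOperator, the setup/teardown pair stands in for the Component lifecycle, and the bounded-lateness heuristic is just one example a user might plug in.

```java
public class HeuristicWatermarkSketch {

  /** Hypothetical stand-in for a windowed tuple: event timestamp plus value. */
  static class TimedTuple<T> {
    final long timestamp;
    final T value;

    TimedTuple(long timestamp, T value) {
      this.timestamp = timestamp;
      this.value = value;
    }
  }

  /** Hypothetical stand-in for a watermark control tuple. */
  static class Watermark {
    final long timestamp;

    Watermark(long timestamp) {
      this.timestamp = timestamp;
    }
  }

  /** Steps 1 and 2: a plugin with lifecycle hooks and a single per-tuple method. */
  interface HeuristicWatermark<T> {
    void setup();
    void teardown();

    /** Returns a watermark if the condition is met, otherwise null. */
    Watermark processTupleForWatermark(TimedTuple<T> input);
  }

  /** Example heuristic: watermark trails the highest timestamp seen by a fixed lateness. */
  static class BoundedLatenessWatermark<T> implements HeuristicWatermark<T> {
    private final long allowedLatenessMillis;
    private long maxTimestamp = Long.MIN_VALUE;

    BoundedLatenessWatermark(long allowedLatenessMillis) {
      this.allowedLatenessMillis = allowedLatenessMillis;
    }

    @Override public void setup() { }
    @Override public void teardown() { }

    @Override
    public Watermark processTupleForWatermark(TimedTuple<T> input) {
      if (input.timestamp <= maxTimestamp) {
        return null; // event time did not advance, so no new watermark
      }
      maxTimestamp = input.timestamp;
      return new Watermark(maxTimestamp - allowedLatenessMillis);
    }
  }

  /** Steps 3 to 5: the operator consults the optional plugin for every tuple. */
  static class WindowedOperatorSketch<T> {
    private HeuristicWatermark<T> heuristicWatermark; // optional, may stay null
    private long nextWatermark = -1;
    long currentWatermark = -1;

    void setHeuristicWatermark(HeuristicWatermark<T> heuristicWatermark) {
      this.heuristicWatermark = heuristicWatermark;
    }

    void processTuple(TimedTuple<T> tuple) {
      if (heuristicWatermark != null) {
        Watermark wm = heuristicWatermark.processTupleForWatermark(tuple);
        if (wm != null) {
          processWatermark(wm);
        }
      }
      // Accumulating the tuple into its windows is omitted here.
    }

    void processWatermark(Watermark wm) {
      nextWatermark = wm.timestamp;
    }

    void endWindow() {
      // The rest of the watermark handling happens at the end of the streaming window.
      if (nextWatermark > currentWatermark) {
        currentWatermark = nextWatermark;
      }
    }
  }

  public static void main(String[] args) {
    WindowedOperatorSketch<String> op = new WindowedOperatorSketch<>();
    op.setHeuristicWatermark(new BoundedLatenessWatermark<String>(100));
    op.processTuple(new TimedTuple<>(1000L, "a"));
    op.processTuple(new TimedTuple<>(950L, "b")); // late tuple, no new watermark
    op.endWindow();
    System.out.println("watermark after window: " + op.currentWatermark);
  }
}
```

Returning null for "no watermark" follows step 4 as written; an Optional return type would be an alternative if null checks in the operator are a concern.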


Please share your opinion on the above approach.

Thanks,
Chinmay.


[jira] [Created] (APEXMALHAR-2354) Add support for heuristic watermarks in WindowedOperator

2016-11-24 Thread Chinmay Kolhatkar (JIRA)
Chinmay Kolhatkar created APEXMALHAR-2354:
-

 Summary: Add support for heuristic watermarks in WindowedOperator
 Key: APEXMALHAR-2354
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2354
 Project: Apache Apex Malhar
  Issue Type: Improvement
Reporter: Chinmay Kolhatkar
Assignee: Chinmay Kolhatkar


The purpose of this improvement is to add support for plugging in heuristic 
algorithms for determining watermarks in WindowedOperator.





Re: Proposal for apex/malhar extensions

2016-11-24 Thread Chinmay Kolhatkar
Thanks everyone for the response.
I've created a Jira for this:
https://issues.apache.org/jira/browse/APEXCORE-576

Feel free to contribute in any way you can to increase contributions to and
usage of Apache Apex.

Thanks,
Chinmay.


On Thu, Nov 17, 2016 at 2:56 AM, Pramod Immaneni 
wrote:

> Yes, I think it could be useful for core as well.
>
> On Wed, Nov 16, 2016 at 11:19 AM, Chinmay Kolhatkar 
> wrote:
>
> > @sanjay, yes we can define the process around this.
> >
> > @pramod, Well, I said apex-malhar because the extensions can be operators
> > and and other plugins to apex engine.
> > Do you see the use of this for apex-core as well?
> >
> > -Chinmay.
> >
> >
> > On Wed, Nov 16, 2016 at 7:24 PM, Pramod Immaneni
> > wrote:
> >
> > > So it would be like a yellow pages for the apex ecosystem. Sounds like a
> > > good idea. Why limit it to malhar?
> > >
> > > On Wed, Nov 16, 2016 at 3:17 AM, Chinmay Kolhatkar
> > > wrote:
> > >
> > > > Dear Community,
> > > >
> > > > This is in relation to malhar cleanup work that is ongoing.
> > > >
> > > > In one of the talks during Apache BigData Europe, I got to know about
> > > > Spark-Packages (https://spark-packages.org/) (I believe lot of you must be
> > > > aware of it).
> > > > Spark package is basically functionality over and above and using Spark
> > > > core functionality. The spark packages can initially present in someone's
> > > > public repository and one could register that with
> > > > https://spark-packages.org/ and later on as it matures and finds more use,
> > > > it gets consumed in mainstream Spark repository and releases.
> > > >
> > > > I found this idea quite interesting to keep our apex-malhar releases
> > > > cleaner.
> > > >
> > > > One could have extension to apex-malhar in their own repository and just
> > > > register itself with Apache Apex. As it matures and find more and more use
> > > > we can consume that in mainstream releases.
> > > > Advantages to this are multiple:
> > > > 1. The entry point for registering extensions with Apache Apex can be
> > > > minimal. This way we get more indirect contributions.
> > > > 2. Faster way to add more feature in the project.
> > > > 3. We keep our releases cleaner.
> > > > 4. One could progress on feature-set faster balancing both Apache Way as
> > > > well as their own Enterprise Interests.
> > > >
> > > > Please share your thoughts on this.
> > > >
> > > > Thanks,
> > > > Chinmay.
> > > >
> > >
> >
>