Re: [jira] [Assigned] (APEXMALHAR-2303) S3 Line By Line Module

2016-10-20 Thread Chaitanya Chebolu
+1 for the new approach, i.e., adding the file length to FileBlockMetadata.

On Thu, Oct 20, 2016 at 12:00 PM, Tushar Gosavi 
wrote:

> I think this approach is cleaner than the previous two approaches you
> mentioned. Depending on an exception or a non-standard error code to
> determine EOF is not a good approach, as we might treat some other valid
> exception as EOF and not take corrective action. This will also avoid
> multiple requests to get the file length from each reader.
>
> - Tushar.
>
>
> On Thu, Oct 20, 2016 at 11:45 AM, AJAY GUPTA  wrote:
> > Hi
> >
> > Following is another approach for getting information regarding the file
> > length for S3.
> >
> > We have an existing class FileBlockMetadata which currently contains only
> > filePath. To this, we can add the fileLength field, which will then get
> > passed to the module. This approach will be a lot cleaner, and no
> > additional requests will be made to S3 in this case.
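
A minimal sketch of the idea above, assuming the splitter already knows the file length and copies it into every block's metadata. The class, field, and method names here are hypothetical and are not the actual Malhar FileBlockMetadata API.

```java
// Hypothetical illustration; names are assumptions, not Malhar's FileBlockMetadata.
final class BlockInfoSketch {
  final String filePath;
  final long offset;      // first byte of the block
  final long length;      // nominal block length
  final long fileLength;  // total file length, set once upstream by the splitter

  BlockInfoSketch(String filePath, long offset, long length, long fileLength) {
    this.filePath = filePath;
    this.offset = offset;
    this.length = length;
    this.fileLength = fileLength;
  }

  // Exclusive end of the byte range to request, clamped so a reader never asks past EOF.
  long readEnd() {
    return Math.min(offset + length, fileLength);
  }

  // True when no bytes remain after this block; no extra request to S3 is needed to find out.
  boolean reachesEndOfFile() {
    return offset + length >= fileLength;
  }
}
```

With the length carried in the metadata, a reader can clamp its range request locally instead of relying on an out-of-range error or a separate length lookup.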
> >
> > Kindly provide your opinion on which approach would be best suited.
> >
> >
> > Regards,
> > Ajay
> >
> > On Wed, Oct 19, 2016 at 6:43 PM, AJAY GUPTA wrote:
> >
> >> Hi
> >>
> >> I need suggestions from the Apex dev community on the following.
> >>
> >> For the S3RecordReader approach mentioned in the previous mail, I am facing
> >> an issue with determining the end of file.
> >> Note that the input to this operator will not contain the file size.
> >>
> >> The following approaches are possible:
> >>
> >> 1) The S3 getObject() call, which fetches file data within a range, will
> >> throw an AmazonS3Exception if the range provided is out of bounds. Hence,
> >> if the file size is 10 bytes and I make a getObject request for bytes 11
> >> to 15, I will get this exception:
> >> Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception:
> >> The requested range is not satisfiable (Service: Amazon S3; Status Code:
> >> 416; Error Code: InvalidRange; Request ID:
> >> If this exception gets thrown, I can catch it in the code and conclude
> >> that the end of the file has been reached.
> >>
> >> 2) For every container running this application, maintain a map from
> >> file to filesize. If the filesize already exists in this map, use it
> >> from there. If not, fetch the filesize information from S3 and add it
> >> to this map.
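
The second approach can be sketched as a small per-container cache. This is only an illustration: the remote lookup is passed in as a function standing in for whatever S3 metadata request would be used, and all names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ToLongFunction;

// Hypothetical per-container cache of file sizes; not part of Malhar.
class FileSizeCacheSketch {
  private final Map<String, Long> sizes = new ConcurrentHashMap<>();
  private final ToLongFunction<String> remoteLookup; // stand-in for a request to S3
  final AtomicInteger remoteCalls = new AtomicInteger();

  FileSizeCacheSketch(ToLongFunction<String> remoteLookup) {
    this.remoteLookup = remoteLookup;
  }

  // Returns the cached size, fetching it remotely only on the first request for a path.
  long sizeOf(String filePath) {
    return sizes.computeIfAbsent(filePath, p -> {
      remoteCalls.incrementAndGet();
      return remoteLookup.applyAsLong(p);
    });
  }
}
```

Each container still pays one remote call per distinct file, which is the cost that the later proposal of carrying the length in the block metadata avoids.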
> >>
> >> My own opinion is to go with the first approach, since the number of
> >> calls to S3 for getting the file length will be lower.
> >> Kindly suggest any other approaches you can think of.
> >>
> >>
> >> Thanks,
> >> Ajay
> >>
> >>
> >>
> >> On Wed, Oct 19, 2016 at 11:53 AM, AJAY GUPTA wrote:
> >>
> >>> Hi Apex Dev community,
> >>>
> >>> Kindly provide feedback, if any, on the following approach for
> >>> implementing S3RecordReader.
> >>>
> >>> *S3RecordReader (delimited records)*
> >>> *Input:* BlockMetaData containing offset and length
> >>> *Expected Output:* Records in the block
> >>> *Approach:*
> >>> Similar to the approach currently followed in FSRecordReader.
> >>> 1) Fetch the block from S3. The S3 block fetch size should ideally be
> >>> large enough, say 64 MB, to avoid unnecessary network delays.
> >>> 2) Search for the newline character in the block and emit the record.
> >>> 3) The last record in the current block might overflow into the
> >>> subsequent block. For this, we will get a small part of the subsequent
> >>> block, say 1 MB, search for the newline character, and emit the record
> >>> if the newline character is found. We will fetch additional 1 MB blocks
> >>> until a newline character is found.
> >>> 4) We will also avoid reading the first record from all blocks (except
> >>> the first block), as this set of bytes is part of the last record in the
> >>> previous block.
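
A minimal sketch of steps 1 to 4, assuming an in-memory byte array stands in for the S3 object and `fetch` stands in for a ranged read. The class and method names are hypothetical, not the actual S3RecordReader. One extra assumption is made explicit: because every non-first block skips its first record, a block also emits the record that starts exactly at its end, so no record is lost when a block ends on a newline.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the actual S3RecordReader.
class BlockRecordSketch {
  // Stand-in for a ranged S3 read: returns bytes [from, to), clamped to the file length.
  static byte[] fetch(byte[] file, long from, long to) {
    int start = (int) Math.min(from, file.length);
    int end = (int) Math.min(to, file.length);
    return Arrays.copyOfRange(file, start, end);
  }

  static int indexOf(byte[] a, char c, int from) {
    for (int j = from; j < a.length; j++) {
      if (a[j] == c) {
        return j;
      }
    }
    return -1;
  }

  // Emits the records owned by the block [offset, offset + length).
  static List<String> readBlock(byte[] file, long offset, long length, int overflowChunk) {
    List<String> records = new ArrayList<>();
    long end = Math.min(offset + length, file.length);
    byte[] block = fetch(file, offset, end);            // step 1: fetch the block
    int start = 0;
    if (offset > 0) {                                   // step 4: skip the first record
      int nl = indexOf(block, '\n', 0);
      start = (nl >= 0) ? nl + 1 : block.length + 1;    // no newline: block owns nothing
    }
    while (start <= block.length && offset + start < file.length) {
      int nl = indexOf(block, '\n', start);
      if (nl >= 0) {                                    // step 2: record ends inside the block
        records.add(new String(block, start, nl - start, StandardCharsets.UTF_8));
        start = nl + 1;
        continue;
      }
      // Step 3: the last record overflows; read small chunks until a newline or EOF.
      ByteArrayOutputStream rec = new ByteArrayOutputStream();
      rec.write(block, start, block.length - start);
      long pos = end;
      boolean done = false;
      while (!done && pos < file.length) {
        byte[] chunk = fetch(file, pos, pos + overflowChunk);
        int k = indexOf(chunk, '\n', 0);
        if (k >= 0) {
          rec.write(chunk, 0, k);
          done = true;
        } else {
          rec.write(chunk, 0, chunk.length);
          pos += chunk.length;
        }
      }
      records.add(new String(rec.toByteArray(), StandardCharsets.UTF_8));
      break;
    }
    return records;
  }
}
```

Reading every block of a file independently with this rule yields each record exactly once, whichever reader happens to process which block.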
> >>>
> >>>
> >>> Regards,
> >>> Ajay
> >>>
> >>>
> >>>
> >>> On Wed, Oct 19, 2016 at 7:31 AM, Ajay Gupta (JIRA) 
> >>> wrote:
> >>>
> 
>   [ https://issues.apache.org/jira/browse/APEXMALHAR-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
>  Ajay Gupta reassigned APEXMALHAR-2303:
>  --
> 
>  Assignee: Ajay Gupta
> 
>  > S3 Line By Line Module
>  > --
>  >
>  > Key: APEXMALHAR-2303
>  > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2303
>  > Project: Apache Apex Malhar
>  >  Issue Type: Bug
>  >Reporter: Ajay Gupta
>  >Assignee: Ajay Gupta
>  >   Original Estimate: 336h
>  >  Remaining Estimate: 336h
>  >
>  > This is a new module which will consist of 2 operators
>  > 1) File Splitter -- Already existing in Malhar library
>  > 2) S3RecordReader -- Read a file from S3 and output the records
>  > (delimited or fixed width)
> 
> 
> 
>  --
>  This message was sent by Atlassian JIRA
>  (v6.3.4#6332)
> 
> 

[jira] [Commented] (APEXMALHAR-1818) Integrate Calcite to support SQL

2016-10-20 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593796#comment-15593796
 ] 

Julian Hyde commented on APEXMALHAR-1818:
-

By the way, when you have committed this feature, feel free to update 
http://calcite.apache.org/docs/powered_by.html by submitting a pull-request for 
an edit to 
https://github.com/apache/calcite/blob/master/site/_docs/powered_by.md.

> Integrate Calcite to support SQL
> 
>
> Key: APEXMALHAR-1818
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-1818
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>  Components: query operators, sql
>Reporter: Amol
>Assignee: Chinmay Kolhatkar
>  Labels: roadmap
>
> Once we have the ability to generate a sub-DAG, we should take a look at 
> integrating Calcite into Apex. The operator that enables populating the DAG 
> should use Calcite to generate the DAG, given a SQL query.





[jira] [Updated] (APEXMALHAR-1818) Integrate Calcite to support SQL

2016-10-20 Thread Chinmay Kolhatkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kolhatkar updated APEXMALHAR-1818:
--
Component/s: sql

> Integrate Calcite to support SQL
> 
>
> Key: APEXMALHAR-1818
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-1818
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>  Components: query operators, sql
>Reporter: Amol
>Assignee: Chinmay Kolhatkar
>  Labels: roadmap
>
> Once we have the ability to generate a sub-DAG, we should take a look at 
> integrating Calcite into Apex. The operator that enables populating the DAG 
> should use Calcite to generate the DAG, given a SQL query.





[GitHub] apex-malhar pull request #462: APEXMALHAR-2305 changed javadoc of SessionWin...

2016-10-20 Thread davidyan74
GitHub user davidyan74 opened a pull request:

https://github.com/apache/apex-malhar/pull/462

APEXMALHAR-2305 changed javadoc of SessionWindows to reflect the implementation change of SessionWindows

@siyuanh please merge

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davidyan74/apex-malhar APEXMALHAR-2305-3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/462.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #462


commit a059805799babf82ba6e3fc0da3d46dfcb505d73
Author: David Yan 
Date:   2016-10-20T19:38:54Z

APEXMALHAR-2305 changed javadoc of SessionWindows to reflect the 
implementation change of SessionWindows




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2305) Change implementation of session window to reflect what is described in streaming 102 blog

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592768#comment-15592768
 ] 

ASF GitHub Bot commented on APEXMALHAR-2305:


GitHub user davidyan74 opened a pull request:

https://github.com/apache/apex-malhar/pull/462

APEXMALHAR-2305 changed javadoc of SessionWindows to reflect the implementation change of SessionWindows

@siyuanh please merge

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davidyan74/apex-malhar APEXMALHAR-2305-3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/462.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #462


commit a059805799babf82ba6e3fc0da3d46dfcb505d73
Author: David Yan 
Date:   2016-10-20T19:38:54Z

APEXMALHAR-2305 changed javadoc of SessionWindows to reflect the 
implementation change of SessionWindows




> Change implementation of session window to reflect what is described in 
> streaming 102 blog
> --
>
> Key: APEXMALHAR-2305
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2305
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: David Yan
>Assignee: David Yan
> Fix For: 3.6.0
>
>
> The proto-session windows described in the streaming 102 blog have a minimum 
> duration that is equal to the session gap. We should do the same in our 
> session window implementation. 
> https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102
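
A minimal sketch of the behavior described above, assuming each event at timestamp t starts a proto-session window [t, t + gap) and that overlapping windows are merged. The names are hypothetical and this is not the Malhar SessionWindows implementation; whether windows that merely touch should merge is left as an assumption (here they do not).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of proto-session windows with a minimum duration of one gap.
class SessionWindowSketch {
  // Returns merged sessions as {start, endExclusive} pairs for the given event timestamps.
  static List<long[]> sessions(long[] timestamps, long gap) {
    long[] sorted = timestamps.clone();
    Arrays.sort(sorted);
    List<long[]> merged = new ArrayList<>();
    for (long t : sorted) {
      long[] proto = {t, t + gap};   // proto-session: duration equals the session gap
      long[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
      if (last != null && proto[0] < last[1]) {
        last[1] = Math.max(last[1], proto[1]);   // overlaps the open session: extend it
      } else {
        merged.add(proto);                       // otherwise start a new session
      }
    }
    return merged;
  }
}
```

Under this rule a session with a single event still spans one full gap, which is the minimum duration the issue asks for.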





[jira] [Updated] (APEXMALHAR-2304) Apex SQL: Add examples for SQL in Apex in demos folder

2016-10-20 Thread Chinmay Kolhatkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kolhatkar updated APEXMALHAR-2304:
--
Component/s: sql

> Apex SQL: Add examples for SQL in Apex in demos folder
> --
>
> Key: APEXMALHAR-2304
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2304
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>  Components: sql
>Reporter: Chinmay Kolhatkar
>Assignee: Chinmay Kolhatkar
>






[jira] [Updated] (APEXMALHAR-2203) Control tuple port and watermark support in high-level API (version 1)

2016-10-20 Thread Siyuan Hua (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyuan Hua updated APEXMALHAR-2203:
---
Summary: Control tuple port and watermark support in high-level API 
(version 1)  (was: Control tuple port and watermark support in high-level API)

> Control tuple port and watermark support in high-level API (version 1)
> --
>
> Key: APEXMALHAR-2203
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2203
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Affects Versions: 3.6.0
>Reporter: Siyuan Hua
>Assignee: Siyuan Hua
> Fix For: 3.6.0
>
>






[GitHub] apex-malhar pull request #460: APEXMALHAR-2305 updated documentation on Sess...

2016-10-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/460




[jira] [Commented] (APEXMALHAR-2305) Change implementation of session window to reflect what is described in streaming 102 blog

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592536#comment-15592536
 ] 

ASF GitHub Bot commented on APEXMALHAR-2305:


Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/460


> Change implementation of session window to reflect what is described in 
> streaming 102 blog
> --
>
> Key: APEXMALHAR-2305
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2305
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: David Yan
>Assignee: David Yan
> Fix For: 3.6.0
>
>
> The proto-session windows described in the streaming 102 blog have a minimum 
> duration that is equal to the session gap. We should do the same in our 
> session window implementation. 
> https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102





[jira] [Updated] (APEXMALHAR-2311) Apex SQL: Allow operators to be configured from outside

2016-10-20 Thread Chinmay Kolhatkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kolhatkar updated APEXMALHAR-2311:
--
Component/s: sql

> Apex SQL: Allow operators to be configured from outside
> ---
>
> Key: APEXMALHAR-2311
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2311
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>  Components: sql
>Reporter: Chinmay Kolhatkar
>
> Currently there is no way to configure the operators added for a given 
> relational tree from an external file, e.g. properties.xml.
> There should be a way to configure the added operators externally.





[GitHub] apex-malhar pull request #432: APEXMALHAR-1818 SQL Support for converting gi...

2016-10-20 Thread chinmaykolhatkar
GitHub user chinmaykolhatkar reopened a pull request:

https://github.com/apache/apex-malhar/pull/432

APEXMALHAR-1818 SQL Support for converting given SQL statement to APEX DAG

Features implemented are:
1. SELECT STATEMENT
2. INSERT STATEMENT
3. INNER JOIN with non-empty equi join condition
4. WHERE clause
5. SCALAR functions implemented in calcite are ready to use
6. Custom scalar functions can be registered.
7. Endpoint can be File OR Kafka OR Streaming Port for both input and output
8. CSV Data Format implemented for both input and output side.
9. Static loading of calcite JDBC driver.
10. Testing on local as well as cluster mode.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chinmaykolhatkar/apex-malhar calcite

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #432


commit 2958c1159780d1ae7f00c67ab61c32f78b903b1e
Author: Chandni Singh 
Date:   2016-10-13T12:03:18Z

APEXMALHAR-1818 Adding BeanClassGenerator for dynamically creating class

commit a60d79eab605b1dffea5f27a73ff198a02024847
Author: Chinmay Kolhatkar 
Date:   2016-10-13T12:05:16Z

APEXMALHAR-1818 SQL Support for converting given SQL statement to APEX DAG.

Features implemented are:
1. SELECT STATEMENT
2. INSERT STATEMENT
3. INNER JOIN with non-empty equi join condition
4. WHERE clause
5. SCALAR functions implemented in calcite are ready to use
6. Custom scalar functions can be registered.
7. Endpoint can be File OR Kafka OR Streaming Port for both input and output
8. CSV Data Format implemented for both input and output side.
9. Testing on local as well as cluster mode.






[jira] [Commented] (APEXMALHAR-2203) Control tuple port and watermark support in high-level API (version 1)

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592504#comment-15592504
 ] 

ASF GitHub Bot commented on APEXMALHAR-2203:


GitHub user siyuanh opened a pull request:

https://github.com/apache/apex-malhar/pull/461

APEXMALHAR-2203 support control tuple and watermark in high-level API



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/siyuanh/apex-malhar WindowedStream

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/461.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #461






> Control tuple port and watermark support in high-level API (version 1)
> --
>
> Key: APEXMALHAR-2203
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2203
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Affects Versions: 3.6.0
>Reporter: Siyuan Hua
>Assignee: Siyuan Hua
> Fix For: 3.6.0
>
>






[jira] [Commented] (APEXMALHAR-2203) Control tuple port and watermark support in high-level API (version 1)

2016-10-20 Thread Siyuan Hua (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592494#comment-15592494
 ] 

Siyuan Hua commented on APEXMALHAR-2203:


Support _single_ control tuple port

Support both watermark in operator itself and/or injecting watermark generation 
after operator based on time or data tuple itself


> Control tuple port and watermark support in high-level API (version 1)
> --
>
> Key: APEXMALHAR-2203
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2203
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Affects Versions: 3.6.0
>Reporter: Siyuan Hua
>Assignee: Siyuan Hua
> Fix For: 3.6.0
>
>






[jira] [Created] (APEXCORE-563) Have a pointer to container log file name and offset in stram events that deliver a container or operator failure event.

2016-10-20 Thread Sanjay M Pujare (JIRA)
Sanjay M Pujare created APEXCORE-563:


 Summary: Have a pointer to container log file name and offset in 
stram events that deliver a container or operator failure event.
 Key: APEXCORE-563
 URL: https://issues.apache.org/jira/browse/APEXCORE-563
 Project: Apache Apex Core
  Issue Type: Bug
Reporter: Sanjay M Pujare


The default DailyRollingFileAppender does not take into account how many 
backup files to keep, which results in unbounded growth of log files, 
especially for long-running applications.
The add-on below to the default DailyRollingFileAppender supports 
maxBackupIndex.
http://wiki.apache.org/logging-log4j/DailyRollingFileAppender





[jira] [Commented] (APEXMALHAR-1818) Integrate Calcite to support SQL

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592355#comment-15592355
 ] 

ASF GitHub Bot commented on APEXMALHAR-1818:


Github user chinmaykolhatkar closed the pull request at:

https://github.com/apache/apex-malhar/pull/432


> Integrate Calcite to support SQL
> 
>
> Key: APEXMALHAR-1818
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-1818
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>  Components: query operators
>Reporter: Amol
>Assignee: Chinmay Kolhatkar
>  Labels: roadmap
>
> Once we have the ability to generate a sub-DAG, we should take a look at 
> integrating Calcite into Apex. The operator that enables populating the DAG 
> should use Calcite to generate the DAG, given a SQL query.





[jira] [Updated] (APEXMALHAR-2310) Apex SQL: Reuse the schema generated automatically in SQL DAG

2016-10-20 Thread Chinmay Kolhatkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kolhatkar updated APEXMALHAR-2310:
--
Component/s: sql

> Apex SQL: Reuse the schema generated automatically in SQL DAG
> -
>
> Key: APEXMALHAR-2310
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2310
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>  Components: sql
>Reporter: Chinmay Kolhatkar
>
> Currently, for each stream a new class (POJO) is generated and assigned to 
> port TUPLE_CLASS. If there is repetition of the same schema, we should reuse 
> the class.





[jira] [Updated] (APEXMALHAR-2296) Apex SQL: Add support for SQL GROUP BY (Aggregate RelNode)

2016-10-20 Thread Chinmay Kolhatkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kolhatkar updated APEXMALHAR-2296:
--
Component/s: sql

> Apex SQL: Add support for SQL GROUP BY (Aggregate RelNode)
> --
>
> Key: APEXMALHAR-2296
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2296
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>  Components: sql
>Reporter: Chinmay Kolhatkar
>Assignee: Chinmay Kolhatkar
>
> Add support for SQL GROUP BY (Aggregate RelNode)





Re: Make cli support high-level API and SQL

2016-10-20 Thread Chinmay Kolhatkar
Are you talking about the apex cli (the one present in apex-core) supporting
both?

As both the high-level API and SQL are in malhar, wouldn't there be a circular
dependency issue?

Besides the possible issues, this is a great idea.

Though, I'd suggest having a separate cli for the high-level API and SQL in
malhar itself first, and then later making that loadable/callable from apexcli.


On Thu, Oct 20, 2016 at 10:19 PM, Siyuan Hua  wrote:

> Given that we already have the first version of the high-level API, the SQL
> PR is getting close to being merged, and we have the ability to launch
> applications programmatically, I think it would be nice to have our cli
> support both the high-level API and SQL directly, so people can simply
> prototype what they need and/or try out Apex with more flexibility.
>
> Any thoughts?
>
>
> Regards,
> Siyuan
>


Make cli support high-level API and SQL

2016-10-20 Thread Siyuan Hua
Given that we already have the first version of the high-level API, the SQL
PR is getting close to being merged, and we have the ability to launch
applications programmatically, I think it would be nice to have our cli
support both the high-level API and SQL directly, so people can simply
prototype what they need and/or try out Apex with more flexibility.

Any thoughts?


Regards,
Siyuan


[jira] [Commented] (APEXCORE-560) Logical plan is not changed when all physical partitions of operator are removed from DAG

2016-10-20 Thread Vlad Rozov (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591998#comment-15591998
 ] 

Vlad Rozov commented on APEXCORE-560:
-

Why is it necessary to remove an operator from the logical plan? A logical plan 
does not correspond one to one to the physical plan. For example, unifiers are 
present only in the physical plan.

> Logical plan is not changed when all physical partitions of operator are 
> removed from DAG
> -
>
> Key: APEXCORE-560
> URL: https://issues.apache.org/jira/browse/APEXCORE-560
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>
> Throwing a ShutdownException() from an input operator removes it from the 
> physical plan, but it can still be seen in the logical plan. Ideally the 
> corresponding logical operator must also be removed.





[jira] [Commented] (APEXMALHAR-2017) Use pre checkpoint notification to optimize operator IO

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591969#comment-15591969
 ] 

ASF GitHub Bot commented on APEXMALHAR-2017:


Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/441


> Use pre checkpoint notification to optimize operator IO
> ---
>
> Key: APEXMALHAR-2017
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2017
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Pramod Immaneni
>Assignee: Velineni Lakshmi Prasanna
>
> Currently many output operators enforce persistence of data on endWindow by 
> calling flush, hflush or equivalent calls. This was done to help recovery: 
> doing this every window ensures that the data corresponding to the checkpoint 
> state is always present at recovery.
> A recent addition to the engine lets the operators know about an impending 
> checkpoint just before it happens using a callback. Operators can now enforce 
> persistence of data once in this callback instead of at the end of every 
> window. This results in better performance as data is not being frequently 
> written to persistent storage.
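
A minimal sketch of the difference, assuming a simplified callback interface that stands in for the engine's window and checkpoint notifications. The interface and class names are hypothetical and the real Apex interfaces are not reproduced here; the order in which the two callbacks fire for a checkpointed window is also an assumption.

```java
// Hypothetical stand-in for the engine callbacks; not the actual Apex API.
interface WindowCallbacksSketch {
  void endWindow();
  void beforeCheckpoint(long windowId);
}

// Buffers tuples in memory and "persists" them on flush, counting how often it flushes.
class BufferedWriterSketch implements WindowCallbacksSketch {
  private final boolean flushEveryWindow;
  private final StringBuilder buffer = new StringBuilder();
  private final StringBuilder persisted = new StringBuilder();
  int flushCount = 0;

  BufferedWriterSketch(boolean flushEveryWindow) {
    this.flushEveryWindow = flushEveryWindow;
  }

  void write(String tuple) {
    buffer.append(tuple);
  }

  private void flush() {
    persisted.append(buffer);
    buffer.setLength(0);
    flushCount++;
  }

  @Override
  public void endWindow() {
    if (flushEveryWindow) {
      flush();                 // old behavior: persist at the end of every window
    }
  }

  @Override
  public void beforeCheckpoint(long windowId) {
    if (!flushEveryWindow) {
      flush();                 // new behavior: persist once, just before the checkpoint
    }
  }

  String persistedData() {
    return persisted.toString();
  }

  // Drives the writer through a number of windows with a checkpoint every few windows.
  static BufferedWriterSketch run(boolean flushEveryWindow, int windows, int checkpointInterval) {
    BufferedWriterSketch w = new BufferedWriterSketch(flushEveryWindow);
    for (int id = 1; id <= windows; id++) {
      w.write("t" + id + ";");
      w.endWindow();
      if (id % checkpointInterval == 0) {
        w.beforeCheckpoint(id);
      }
    }
    return w;
  }
}
```

In this sketch both variants have persisted the same data at every checkpoint, but the second one flushes once per checkpoint interval instead of once per window.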





[jira] [Resolved] (APEXMALHAR-2017) Use pre checkpoint notification to optimize operator IO

2016-10-20 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved APEXMALHAR-2017.
--
   Resolution: Fixed
Fix Version/s: 3.6.0

> Use pre checkpoint notification to optimize operator IO
> ---
>
> Key: APEXMALHAR-2017
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2017
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Pramod Immaneni
>Assignee: Velineni Lakshmi Prasanna
> Fix For: 3.6.0
>
>
> Currently many output operators enforce persistence of data on endWindow by 
> calling flush, hflush or equivalent calls. This was done to help recovery: 
> doing this every window ensures that the data corresponding to the checkpoint 
> state is always present at recovery.
> A recent addition to the engine lets the operators know about an impending 
> checkpoint just before it happens using a callback. Operators can now enforce 
> persistence of data once in this callback instead of at the end of every 
> window. This results in better performance as data is not being frequently 
> written to persistent storage.





[GitHub] apex-malhar pull request #441: APEXMALHAR-2017

2016-10-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/441




Re: S3 Output Module

2016-10-20 Thread Chaitanya Chebolu
Hi All,

I am proposing the below new design for S3 Output Module using multi part
upload feature:

Input to this Module: FileMetadata, FileBlockMetadata, ReaderRecord

Steps for uploading files using S3 multipart feature:

=

   1. Initiate the upload. S3 will return an upload id.
      Mandatory: bucket name, file path
      Note: The upload id is the unique identifier for the multipart upload
      of a file.

   2. Upload each block using the received upload id. S3 will return an ETag
      in the response to each upload.
      Mandatory: block number, upload id

   3. Send the merge request by providing the upload id and the list of ETags.
      Mandatory: upload id, file path, block ETags.
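
The three steps above can be sketched with an in-memory stand-in for S3. This is an illustration of the protocol shape only: the class, its methods, and the fake ETag scheme are hypothetical and are not the AWS SDK.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the three-step multipart protocol; not the AWS SDK.
class MultipartSketch {
  private static final class Upload {
    final String key;
    final TreeMap<Integer, byte[]> parts = new TreeMap<>();

    Upload(String key) {
      this.key = key;
    }
  }

  private final Map<String, Upload> uploads = new HashMap<>();
  private final Map<String, byte[]> objects = new HashMap<>();
  private int nextId = 1;

  // Step 1: initiate the upload; returns the upload id.
  String initiate(String bucket, String filePath) {
    String uploadId = "upload-" + nextId++;
    uploads.put(uploadId, new Upload(bucket + "/" + filePath));
    return uploadId;
  }

  // Step 2: upload one block under the upload id; returns a (fake) ETag for it.
  String uploadPart(String uploadId, int partNumber, byte[] data) {
    uploads.get(uploadId).parts.put(partNumber, data.clone());
    return fakeETag(data);
  }

  // Step 3: merge request; parts are concatenated in part-number order.
  String complete(String uploadId, SortedMap<Integer, String> partETags) {
    Upload u = uploads.remove(uploadId);
    if (u == null) {
      throw new IllegalStateException("unknown upload id " + uploadId);
    }
    ByteArrayOutputStream merged = new ByteArrayOutputStream();
    for (Map.Entry<Integer, String> e : partETags.entrySet()) {
      byte[] part = u.parts.get(e.getKey());
      if (part == null || !fakeETag(part).equals(e.getValue())) {
        throw new IllegalStateException("unknown or mismatched part " + e.getKey());
      }
      merged.write(part, 0, part.length);
    }
    objects.put(u.key, merged.toByteArray());
    return u.key;
  }

  byte[] get(String key) {
    return objects.get(key);
  }

  private static String fakeETag(byte[] data) {
    return Integer.toHexString(Arrays.hashCode(data));
  }
}
```

Because the merge is keyed by part number, blocks can be uploaded in any order and by different operators, as long as the upload id and the ETags reach whichever operator sends the final merge request.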

Here 
is an example link for uploading a file using multi part feature:


I am proposing the below two approaches for S3 output module.


(Solution 1)

S3 Output Module consists of the below two operators:

1) BlockWriter: Writes the blocks into HDFS. Once a block is successfully
written into HDFS, this operator emits the BlockMetadata.

2) S3MultiPartUpload: This consists of two parts:

 a) If the number of blocks of a file is > 1, upload the blocks
using the multipart feature. Otherwise, upload the block using
putObject().

 b) Once all the blocks are successfully uploaded, send the
merge complete request.


(Solution 2)

The DAG for this solution is as follows:

1) InitateS3Upload:

Input: FileMetadata

Initiates the upload. This operator emits (filemetadata, uploadId) to
S3FileMerger and (filePath, uploadId) to S3BlockUpload.

2) S3BlockUpload:

Input: FileBlockMetadata, ReaderRecord

Upload the blocks into S3. S3 will return ETag for each upload.
S3BlockUpload emits (path, ETag) to S3FileMerger.

3) S3FileMerger: Sends the file merge request to S3.

Pros:

(1) Supports uploading files of up to 5 TB.

(2) Reduces the end-to-end latency, because we do not wait to upload
until all the blocks of a file are written to HDFS.

Please vote and share your thoughts on these approaches.

Regards,
Chaitanya

On Tue, Mar 29, 2016 at 2:35 PM, Chaitanya Chebolu <
chaita...@datatorrent.com> wrote:

> @ Tushar
>
>   S3 Copy Output Module consists of following operators:
> 1) BlockWriter : Writes the blocks into the HDFS.
> 2) Synchronizer: Sends trigger to downstream operator, when all the blocks
> for a file written to HDFS.
> 3) FileMerger: Merges all the blocks into a file and will upload the
> merged file into S3 bucket.
>
> @ Ashwin
>
> Good suggestion. In the first iteration, I will add the proposed
> design.
> I will add multipart support in the next iteration.
>
> Regards,
> Chaitanya
>
> On Thu, Mar 24, 2016 at 2:44 AM, Ashwin Chandra Putta <
> ashwinchand...@gmail.com> wrote:
>
>> +1 regarding the s3 upload functionality.
>>
>> However, I think we should just focus on multipart upload directly as it
>> comes with various advantages like higher throughput, faster recovery, not
>> needing to wait for entire file being created before uploading each part.
>> See: http://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html
>>
>> Also, seems like we can do multipart upload if the file size is more than
>> 5MB. They do recommend using multipart if the file size is more than
>> 100MB.
>> I am not sure if there is a hard lower limit though. See:
>> http://docs.aws.amazon.com/AmazonS3/latest/dev/UploadingObjects.html
>>
>> This way, it seems like we don't have to wait until a file is completely
>> written to hdfs before performing the upload operation.
>>
>> Regards,
>> Ashwin.
>>
>> On Wed, Mar 23, 2016 at 5:10 AM, Tushar Gosavi 
>> wrote:
>>
>> > +1 , we need this functionality.
>> >
>> > Is it going to be a single operator or multiple operators? If multiple
>> > operators, then can you explain what functionality each operator will
>> > provide?
>> >
>> >
>> > Regards,
>> > -Tushar.
>> >
>> >
>> > On Wed, Mar 23, 2016 at 5:01 PM, Yogi Devendra wrote:
>> >
>> > > Writing to S3 is a common use-case for applications.
>> > > This module will be definitely helpful.
>> > >
>> > > +1 for adding this module.
>> > >
>> > >
>> > > ~ Yogi
>> > >
>> > > On 22 March 2016 at 13:52, Chaitanya Chebolu <
>> chaita...@datatorrent.com>
>> > > wrote:
>> > >
>> > > > Hi All,
>> > > >
>> > > >   I am proposing S3 output copy Module. Primary functionality of
>> this
>> > > > module is uploading files to S3 bucket using block-by-block
>> approach.
>> > > >
>> > > >   Below is the JIRA created for this task:
>> > > > https://issues.apache.org/jira/browse/APEXMALHAR-2022
>> > > >
>> > > >   Design of this module is similar to HDFS copy module. So, I will
>> > extend
>> > > > HDFS copy module for S3.
>> > > >
>> > > > Design of this Module:
>> > > > ===
>> > > > 1) Write the blocks into HDFS.
>> > > > 2) Merge the blocks into a file.
>> > > > 3) Upload the above 
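
Editorial sketch for the multipart suggestion quoted above: one way to see why the whole file need not exist before uploading is to plan the part ranges from the file length alone. The class, method, and constant names below are hypothetical, the code does not call S3, and the 5 MiB floor is the assumed S3 minimum for every part except the last one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a file length into multipart-upload part ranges. */
class MultipartPlan {
  /** Assumed S3 minimum size for every part except the last one (5 MiB). */
  static final long MIN_PART_SIZE = 5L * 1024 * 1024;

  /** Returns {offset, length} pairs covering [0, fileLength) in partSize steps. */
  static List<long[]> planParts(long fileLength, long partSize) {
    if (fileLength < 0) {
      throw new IllegalArgumentException("negative file length");
    }
    if (partSize < MIN_PART_SIZE) {
      throw new IllegalArgumentException("part size below the assumed minimum");
    }
    List<long[]> parts = new ArrayList<>();
    for (long offset = 0; offset < fileLength; offset += partSize) {
      // The last part may be shorter than partSize.
      parts.add(new long[] {offset, Math.min(partSize, fileLength - offset)});
    }
    return parts;
  }

  public static void main(String[] args) {
    for (long[] part : planParts(12L * 1024 * 1024, MIN_PART_SIZE)) {
      System.out.println("offset=" + part[0] + " length=" + part[1]);
    }
  }
}
```

Each range could be handed to a separate uploader as soon as the corresponding block is available, which is the property the thread is after.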

[GitHub] apex-malhar pull request #452: APEXMALHAR-2290 Optimization to fetch meta da...

2016-10-20 Thread Hitesh-Scorpio
Github user Hitesh-Scorpio closed the pull request at:

https://github.com/apache/apex-malhar/pull/452


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (APEXCORE-532) New dynamically added operator does not start with correct windowId.

2016-10-20 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise updated APEXCORE-532:
--
 Assignee: Tushar Gosavi
Fix Version/s: 3.5.0

> New dynamically added operator does not start with correct windowId.
> 
>
> Key: APEXCORE-532
> URL: https://issues.apache.org/jira/browse/APEXCORE-532
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Tushar Gosavi
>Assignee: Tushar Gosavi
>Priority: Critical
> Fix For: 3.5.0
>
>
> During a dynamic DAG change, if a new operator is added and connected to an
> existing operator, it does not start with the correct windowId. The baseSeconds
> is set to 0, causing windowId management problems at the master and effectively
> halting purge from the buffer server.
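
Editorial sketch of why a zero baseSeconds could matter. It assumes, for illustration only, that a window id is a 64-bit value with baseSeconds in the high 32 bits and a window counter in the low 32 bits; that layout and every name below are assumptions, not taken from the issue.

```java
/** Hypothetical illustration of a window id packed from baseSeconds and a counter. */
class WindowIdSketch {
  /** Packs baseSeconds into the high 32 bits and the window index into the low 32 bits. */
  static long windowId(int baseSeconds, int windowIndex) {
    return ((long) baseSeconds << 32) | (windowIndex & 0xFFFFFFFFL);
  }

  /** Recovers the baseSeconds component from a packed window id. */
  static int baseSeconds(long windowId) {
    return (int) (windowId >>> 32);
  }

  public static void main(String[] args) {
    long existing = windowId(1476950400, 120); // operator that has been running
    long added = windowId(0, 1);               // new operator with baseSeconds = 0
    // Under the assumed layout the new operator's id is far smaller than its peers'.
    System.out.println(added < existing);
  }
}
```

Under that assumption, any bookkeeping that takes a minimum window id across operators would be pinned near zero by the new operator, which is consistent with the purge stall described in the issue.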



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-532) New dynamically added operator does not start with correct windowId.

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591665#comment-15591665
 ] 

ASF GitHub Bot commented on APEXCORE-532:
-

Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/402


> New dynamically added operator does not start with correct windowId.
> 
>
> Key: APEXCORE-532
> URL: https://issues.apache.org/jira/browse/APEXCORE-532
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Tushar Gosavi
>Priority: Critical
>
> During a dynamic DAG change, if a new operator is added and connected to an
> existing operator, it does not start with the correct windowId. The baseSeconds
> is set to 0, causing windowId management problems at the master and effectively
> halting purge from the buffer server.





[GitHub] apex-core pull request #402: APEXCORE-532: Fix issue where new operators add...

2016-10-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/402




[GitHub] apex-malhar pull request #452: APEXMALHAR-2290 Optimization to fetch meta da...

2016-10-20 Thread Hitesh-Scorpio
GitHub user Hitesh-Scorpio reopened a pull request:

https://github.com/apache/apex-malhar/pull/452

APEXMALHAR-2290 Optimization to fetch meta data in JdbcPOJOInsertOutput 
Operator.

@bhupeshchawda @chinmaykolhatkar please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Hitesh-Scorpio/apex-malhar 
APEXMALHAR-2290_JDBCMaxRows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #452


commit cff582f5581bd07e5ca3398a63b171fccd069785
Author: Hitesh-Scorpio 
Date:   2016-10-12T15:00:22Z

APEXMALHAR-2290 fix to optimize the function which was populating meta data 
for columns






[jira] [Commented] (APEXMALHAR-2290) JDBC operator does not deploy in certain cases

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591693#comment-15591693
 ] 

ASF GitHub Bot commented on APEXMALHAR-2290:


Github user Hitesh-Scorpio closed the pull request at:

https://github.com/apache/apex-malhar/pull/452


> JDBC operator does not deploy in certain cases
> --
>
> Key: APEXMALHAR-2290
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2290
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> In some cases the JDBC output operator does not deploy properly. The operator
> gets killed and does not get deployed.
> The setup method uses "select * from tablename" just to get the types of the
> fields in the table. If the table already has a lot of data, the select query
> runs for more than 30 sec and the operator is killed.
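
Editorial sketch of one common way to read column types without scanning the table. It is not necessarily what the pull request does: it uses a predicate that is always false so the database returns metadata but no rows, and additionally caps the row count. The class and method names are hypothetical, and the table name is assumed to be trusted because it is concatenated into the SQL text.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reads column names and java.sql.Types codes for a table. */
class ColumnTypeLookup {
  /** Builds a query whose predicate is always false, so no rows are fetched. */
  static String metadataQuery(String tableName) {
    return "SELECT * FROM " + tableName + " WHERE 1 = 0";
  }

  /** Returns column name to java.sql.Types code, in column order. */
  static Map<String, Integer> columnTypes(Connection conn, String tableName)
      throws SQLException {
    Map<String, Integer> types = new LinkedHashMap<>();
    try (Statement statement = conn.createStatement()) {
      statement.setMaxRows(1); // extra guard in case the predicate is not optimized away
      try (ResultSet rs = statement.executeQuery(metadataQuery(tableName))) {
        ResultSetMetaData meta = rs.getMetaData();
        for (int i = 1; i <= meta.getColumnCount(); i++) {
          types.put(meta.getColumnName(i), meta.getColumnType(i));
        }
      }
    }
    return types;
  }
}
```

Because the result set is empty, the call should cost about the same regardless of how many rows the table holds.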





[jira] [Commented] (APEXMALHAR-2290) JDBC operator does not deploy in certain cases

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591695#comment-15591695
 ] 

ASF GitHub Bot commented on APEXMALHAR-2290:


GitHub user Hitesh-Scorpio reopened a pull request:

https://github.com/apache/apex-malhar/pull/452

APEXMALHAR-2290 Optimization to fetch meta data in JdbcPOJOInsertOutput 
Operator.

@bhupeshchawda @chinmaykolhatkar please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Hitesh-Scorpio/apex-malhar 
APEXMALHAR-2290_JDBCMaxRows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #452


commit cff582f5581bd07e5ca3398a63b171fccd069785
Author: Hitesh-Scorpio 
Date:   2016-10-12T15:00:22Z

APEXMALHAR-2290 fix to optimize the function which was populating meta data 
for columns




> JDBC operator does not deploy in certain cases
> --
>
> Key: APEXMALHAR-2290
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2290
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> In some cases the JDBC output operator does not deploy properly. The operator
> gets killed and does not get deployed.
> The setup method uses "select * from tablename" just to get the types of the
> fields in the table. If the table already has a lot of data, the select query
> runs for more than 30 sec and the operator is killed.





[GitHub] apex-malhar pull request #452: APEXMALHAR-2290 Optimization to fetch meta da...

2016-10-20 Thread Hitesh-Scorpio
GitHub user Hitesh-Scorpio reopened a pull request:

https://github.com/apache/apex-malhar/pull/452

APEXMALHAR-2290 Optimization to fetch meta data in JdbcPOJOInsertOutput 
Operator.

@bhupeshchawda @chinmaykolhatkar please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Hitesh-Scorpio/apex-malhar 
APEXMALHAR-2290_JDBCMaxRows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #452


commit cff582f5581bd07e5ca3398a63b171fccd069785
Author: Hitesh-Scorpio 
Date:   2016-10-12T15:00:22Z

APEXMALHAR-2290 fix to optimize the function which was populating meta data 
for columns






[GitHub] apex-malhar pull request #452: APEXMALHAR-2290 Optimization to fetch meta da...

2016-10-20 Thread Hitesh-Scorpio
Github user Hitesh-Scorpio closed the pull request at:

https://github.com/apache/apex-malhar/pull/452




[jira] [Commented] (APEXCORE-532) New dynamically added operator does not start with correct windowId.

2016-10-20 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591671#comment-15591671
 ] 

Thomas Weise commented on APEXCORE-532:
---

Still need to look into the input operator addition scenario. Keeping the JIRA
open until the release.

> New dynamically added operator does not start with correct windowId.
> 
>
> Key: APEXCORE-532
> URL: https://issues.apache.org/jira/browse/APEXCORE-532
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Tushar Gosavi
>Assignee: Tushar Gosavi
>Priority: Critical
> Fix For: 3.5.0
>
>
> During a dynamic DAG change, if a new operator is added and connected to an
> existing operator, it does not start with the correct windowId. The baseSeconds
> is set to 0, causing windowId management problems at the master and effectively
> halting purge from the buffer server.





[jira] [Created] (APEXMALHAR-2311) Apex SQL: Allow operators to be configured from outside

2016-10-20 Thread Chinmay Kolhatkar (JIRA)
Chinmay Kolhatkar created APEXMALHAR-2311:
-

 Summary: Apex SQL: Allow operators to be configured from outside
 Key: APEXMALHAR-2311
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2311
 Project: Apache Apex Malhar
  Issue Type: New Feature
Reporter: Chinmay Kolhatkar


Currently there is no way to configure the operators added for a given
relational tree from an external file, e.g. properties.xml.

There should be a way to configure the added operators externally.
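
Editorial sketch of what external configuration could look like. It assumes property keys of the form `dt.operator.<name>.prop.<property>` and uses plain `java.util.Properties` instead of an XML file; the key format, the class name, and the operator names are assumptions for illustration and are not taken from the issue.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper: selects the externally supplied properties of one operator. */
class OperatorConfigSketch {
  /** Returns property name to value for keys under dt.operator.<operatorName>.prop. */
  static Map<String, String> propertiesFor(Properties all, String operatorName) {
    String prefix = "dt.operator." + operatorName + ".prop.";
    Map<String, String> result = new TreeMap<>();
    for (String key : all.stringPropertyNames()) {
      if (key.startsWith(prefix)) {
        // Strip the prefix so only the operator's own property name remains.
        result.put(key.substring(prefix.length()), all.getProperty(key));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Properties external = new Properties();
    external.setProperty("dt.operator.csvParser.prop.delimiter", ",");
    external.setProperty("dt.operator.fileOut.prop.maxLength", "1024");
    System.out.println(propertiesFor(external, "csvParser"));
  }
}
```

The open design question in the issue is how generated operators get stable names that such keys can refer to.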





[jira] [Created] (APEXMALHAR-2310) Apex SQL: Reuse the schema generated automatically in SQL DAG

2016-10-20 Thread Chinmay Kolhatkar (JIRA)
Chinmay Kolhatkar created APEXMALHAR-2310:
-

 Summary: Apex SQL: Reuse the schema generated automatically in SQL 
DAG
 Key: APEXMALHAR-2310
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2310
 Project: Apache Apex Malhar
  Issue Type: Improvement
Reporter: Chinmay Kolhatkar


Currently, for each stream a new class (POJO) is generated and assigned to the
port's TUPLE_CLASS. If the same schema is repeated, we should reuse the
class.
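
Editorial sketch of the reuse idea: cache the generated class name under a canonical key built from the schema, so a repeated schema maps to the class generated the first time. All names are hypothetical, and class generation is stood in for by producing a name.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical cache: one generated class name per distinct schema. */
class SchemaClassCache {
  private final Map<String, String> classNameBySchema = new HashMap<>();
  private int generated = 0;

  /** Canonical key: field names and types in declaration order. */
  static String schemaKey(LinkedHashMap<String, Class<?>> fields) {
    StringBuilder key = new StringBuilder();
    for (Map.Entry<String, Class<?>> field : fields.entrySet()) {
      key.append(field.getKey()).append(':').append(field.getValue().getName()).append(';');
    }
    return key.toString();
  }

  /** Returns the cached class name for the schema, generating one only on first use. */
  String classNameFor(LinkedHashMap<String, Class<?>> fields) {
    return classNameBySchema.computeIfAbsent(
        schemaKey(fields), k -> "GeneratedPojo_" + (generated++));
  }

  /** Number of classes that had to be generated so far. */
  int generatedCount() {
    return generated;
  }
}
```

Whether field order should be part of the key is a design choice; this sketch treats schemas with the same fields in a different order as distinct.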





[jira] [Commented] (APEXCORE-560) Logical plan is not changed when all physical partitions of operator are removed from DAG

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591398#comment-15591398
 ] 

ASF GitHub Bot commented on APEXCORE-560:
-

GitHub user bhupeshchawda opened a pull request:

https://github.com/apache/apex-core/pull/413

APEXCORE-560: Fixed logical operator removal when all physical partit…

…ions are removed
Added unit test.

@tushargosavi  Please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bhupeshchawda/apex-core 
APEXCORE-560-remove-logical-operator

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #413


commit 15d8460a9a21db661f3117171c674c95263bd1aa
Author: bhupeshchawda 
Date:   2016-10-20T09:42:04Z

APEXCORE-560: Fixed logical operator removal when all physical partitions 
are removed




> Logical plan is not changed when all physical partitions of operator are 
> removed from DAG
> -
>
> Key: APEXCORE-560
> URL: https://issues.apache.org/jira/browse/APEXCORE-560
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>
> Throwing a ShutdownException() from an input operator removes its partitions
> from the physical plan, but the operator can still be seen in the logical plan.
> Ideally the corresponding logical operator should also be removed.
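
Editorial sketch of the behavior the issue asks for: track the physical partitions of each logical operator and drop the logical operator when its last partition goes away. This is a toy model with hypothetical names, not the engine's plan classes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy plan: logical operator name to its physical partition ids. */
class PlanSketch {
  private final Map<String, Set<Integer>> partitionsByLogicalOperator = new HashMap<>();

  /** Registers a physical partition under its logical operator. */
  void addPartition(String logicalName, int partitionId) {
    partitionsByLogicalOperator
        .computeIfAbsent(logicalName, k -> new HashSet<>())
        .add(partitionId);
  }

  /**
   * Removes one physical partition. When it was the last one, the logical
   * operator is removed as well; returns true in that case.
   */
  boolean removePartition(String logicalName, int partitionId) {
    Set<Integer> partitions = partitionsByLogicalOperator.get(logicalName);
    if (partitions == null) {
      return false;
    }
    partitions.remove(partitionId);
    if (partitions.isEmpty()) {
      partitionsByLogicalOperator.remove(logicalName);
      return true;
    }
    return false;
  }

  /** True while the logical operator is still part of the plan. */
  boolean hasLogicalOperator(String logicalName) {
    return partitionsByLogicalOperator.containsKey(logicalName);
  }
}
```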





[jira] [Commented] (APEXMALHAR-2290) JDBC operator does not deploy in certain cases

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591292#comment-15591292
 ] 

ASF GitHub Bot commented on APEXMALHAR-2290:


GitHub user Hitesh-Scorpio reopened a pull request:

https://github.com/apache/apex-malhar/pull/452

APEXMALHAR-2290 Optimization to fetch meta data in JdbcPOJOInsertOutput 
Operator.

@bhupeshchawda @chinmaykolhatkar please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Hitesh-Scorpio/apex-malhar 
APEXMALHAR-2290_JDBCMaxRows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #452


commit cff582f5581bd07e5ca3398a63b171fccd069785
Author: Hitesh-Scorpio 
Date:   2016-10-12T15:00:22Z

APEXMALHAR-2290 fix to optimize the function which was populating meta data 
for columns




> JDBC operator does not deploy in certain cases
> --
>
> Key: APEXMALHAR-2290
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2290
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> In some cases the JDBC output operator does not deploy properly. The operator
> gets killed and does not get deployed.
> The setup method uses "select * from tablename" just to get the types of the
> fields in the table. If the table already has a lot of data, the select query
> runs for more than 30 sec and the operator is killed.





[jira] [Commented] (APEXMALHAR-2290) JDBC operator does not deploy in certain cases

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591374#comment-15591374
 ] 

ASF GitHub Bot commented on APEXMALHAR-2290:


Github user Hitesh-Scorpio closed the pull request at:

https://github.com/apache/apex-malhar/pull/452


> JDBC operator does not deploy in certain cases
> --
>
> Key: APEXMALHAR-2290
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2290
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> In some cases the JDBC output operator does not deploy properly. The operator
> gets killed and does not get deployed.
> The setup method uses "select * from tablename" just to get the types of the
> fields in the table. If the table already has a lot of data, the select query
> runs for more than 30 sec and the operator is killed.





[GitHub] apex-core pull request #413: APEXCORE-560: Fixed logical operator removal wh...

2016-10-20 Thread bhupeshchawda
GitHub user bhupeshchawda opened a pull request:

https://github.com/apache/apex-core/pull/413

APEXCORE-560: Fixed logical operator removal when all physical partit…

…ions are removed
Added unit test.

@tushargosavi  Please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bhupeshchawda/apex-core 
APEXCORE-560-remove-logical-operator

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #413


commit 15d8460a9a21db661f3117171c674c95263bd1aa
Author: bhupeshchawda 
Date:   2016-10-20T09:42:04Z

APEXCORE-560: Fixed logical operator removal when all physical partitions 
are removed






[jira] [Commented] (APEXMALHAR-2290) JDBC operator does not deploy in certain cases

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591375#comment-15591375
 ] 

ASF GitHub Bot commented on APEXMALHAR-2290:


GitHub user Hitesh-Scorpio reopened a pull request:

https://github.com/apache/apex-malhar/pull/452

APEXMALHAR-2290 Optimization to fetch meta data in JdbcPOJOInsertOutput 
Operator.

@bhupeshchawda @chinmaykolhatkar please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Hitesh-Scorpio/apex-malhar 
APEXMALHAR-2290_JDBCMaxRows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #452


commit cff582f5581bd07e5ca3398a63b171fccd069785
Author: Hitesh-Scorpio 
Date:   2016-10-12T15:00:22Z

APEXMALHAR-2290 fix to optimize the function which was populating meta data 
for columns




> JDBC operator does not deploy in certain cases
> --
>
> Key: APEXMALHAR-2290
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2290
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> In some cases the JDBC output operator does not deploy properly. The operator
> gets killed and does not get deployed.
> The setup method uses "select * from tablename" just to get the types of the
> fields in the table. If the table already has a lot of data, the select query
> runs for more than 30 sec and the operator is killed.





[jira] [Commented] (APEXMALHAR-2290) JDBC operator does not deploy in certain cases

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591291#comment-15591291
 ] 

ASF GitHub Bot commented on APEXMALHAR-2290:


Github user Hitesh-Scorpio closed the pull request at:

https://github.com/apache/apex-malhar/pull/452


> JDBC operator does not deploy in certain cases
> --
>
> Key: APEXMALHAR-2290
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2290
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> In some cases the JDBC output operator does not deploy properly. The operator
> gets killed and does not get deployed.
> The setup method uses "select * from tablename" just to get the types of the
> fields in the table. If the table already has a lot of data, the select query
> runs for more than 30 sec and the operator is killed.





[GitHub] apex-malhar pull request #452: APEXMALHAR-2290 Optimization to fetch meta da...

2016-10-20 Thread Hitesh-Scorpio
Github user Hitesh-Scorpio closed the pull request at:

https://github.com/apache/apex-malhar/pull/452

