Re: [ANNOUNCE] New Apache Apex PMC Member: Chinmay Kolhatkar
Congrats Chinmay !!!

On Fri, May 25, 2018 at 1:54 PM, Shubhrajyoti Mohapatra <sjmohapat...@gmail.com> wrote:

> Congratulations Chinmay!
>
> On Fri, May 25, 2018 at 1:06 PM, Ambarish Pande <ambarish.pande2...@gmail.com> wrote:
>
> > Congratulations Chinmay!
> >
> > On Fri, 25 May 2018 at 12:56 PM, priyanka gugale <pri...@apache.org> wrote:
> >
> > > Congrats Chinmay!!
> > >
> > > -Priyanka
> > >
> > > On Fri, May 25, 2018, 11:15 AM vikram patil <patilvik...@gmail.com> wrote:
> > >
> > > > Congratulations Chinmay !!!
> > > >
> > > > On Fri, May 25, 2018 at 11:00 AM, Ashwin Chandra Putta <ashwinchand...@gmail.com> wrote:
> > > > > Congrats Chinmay. Well deserved.
> > > > >
> > > > > On Thu, May 24, 2018 at 10:10 PM Vlad Rozov <vro...@apache.org> wrote:
> > > > >
> > > > >> Congrats Chinmay!
> > > > >>
> > > > >> Thank you,
> > > > >>
> > > > >> Vlad
> > > > >>
> > > > >> On 5/24/18 12:27, Ananth G wrote:
> > > > >> > Congratulations Chinmay
> > > > >> >
> > > > >> > Regards
> > > > >> > Ananth
> > > > >> >
> > > > >> >> On 25 May 2018, at 4:19 am, amol kekre <amolhke...@gmail.com> wrote:
> > > > >> >>
> > > > >> >> Congrats Chinmay
> > > > >> >>
> > > > >> >> Amol
> > > > >> >>
> > > > >> >> On Thu, May 24, 2018 at 10:34 AM, Hitesh Kapoor <forhitesh...@gmail.com> wrote:
> > > > >> >>
> > > > >> >>> Congratulations Chinmay!!
> > > > >> >>>
> > > > >> >>> Regards,
> > > > >> >>> Hitesh Kapoor
> > > > >> >>>
> > > > >> >>>> On Thu 24 May, 2018, 11:02 PM Ilya Ganelin, <ilgan...@gmail.com> wrote:
> > > > >> >>>>
> > > > >> >>>> Congrats!
> > > > >> >>>>
> > > > >> >>>> On Thu, May 24, 2018, 10:09 AM Pramod Immaneni <pra...@apache.org> wrote:
> > > > >> >>>>> Congratulations Chinmay.
> > > > >> >>>>>
> > > > >> >>>>>> On Thu, May 24, 2018 at 9:39 AM Thomas Weise <t...@apache.org> wrote:
> > > > >> >>>>>>
> > > > >> >>>>>> The Apache Apex PMC is pleased to announce that Chinmay Kolhatkar is now a PMC member.
> > > > >> >>>>>>
> > > > >> >>>>>> Chinmay has contributed to Apex in many ways, including:
> > > > >> >>>>>>
> > > > >> >>>>>> - Various transform operators in Malhar
> > > > >> >>>>>> - SQL translation based on Calcite
> > > > >> >>>>>> - Apache Bigtop integration
> > > > >> >>>>>> - Docker sandbox
> > > > >> >>>>>> - Blogs and conference presentations
> > > > >> >>>>>>
> > > > >> >>>>>> We appreciate all his contributions to the project so far, and are looking forward to more.
> > > > >> >>>>>>
> > > > >> >>>>>> Congrats!
> > > > >> >>>>>> Thomas, for the Apache Apex PMC.
> > > > >
> > > > > --
> > > > >
> > > > > Regards,
> > > > > Ashwin.
>
> --
> Thanks and Regards,
> Shubhrajyoti Mohapatra
> Mobile: +91-9769292295

--
Thanks & Regards
Deepak Narkhede
Re: [Feature Proposal] Add metrics dropwizards like gauges, meters, histogram etc to Apex Platform
Thanks all for the suggestions. I'll incorporate the suggestions and questions from each of you and come up with a concrete design with multiple phases/stages.

- Deepak

On Thu, May 10, 2018 at 8:22 AM, Ananth G <ananthg.a...@gmail.com> wrote:

> +1 for the feature.
>
> +1 for an abstracted way of metrics library integration.
>
> Some additional thoughts on this:
>
> - We might have implications of adding a set of dependencies from Dropwizard into the engine core (Guava versions etc.).
> - Flink, as pointed out by Thomas, seems to have taken an interesting approach to metrics (apart from supporting multiple reporters): any reporter is only enabled if the relevant jar is put in the classpath.
> - There seem to be two worlds for generating the metric names: 1. dot-separated notation (or some other separator; e.g. Dropwizard style) and 2. high-dimensional metric names using key-value pair tags (Prometheus style). We might have to "transform" the dot hierarchical notation to a key-value pair based notation, depending on the reporter chosen by the end user.
> - The scope of the implementation is too big as I understand it, and perhaps this feature needs to be done in multiple JIRAs. Also, Malhar is still at Java 7 while the engine is at Java 8.
>
> Some questions:
>
> - Is there a plan to expose the metrics via a Jetty endpoint on each JVM (if configured)?
> - How do we plan to handle dynamic partitioning of operators for metrics (i.e. possibly short-lived JVMs)?
> - This feature implies that we will no longer support an equivalent of the ComplexType of AutoMetrics?
> - This feature also implies that we will no longer support AutoMetric aggregators (and leave this to the metric tools' functionality)?
>
> Regards,
> Ananth
>
> On Thu, May 10, 2018 at 12:37 PM, Chinmay Kolhatkar <chinmaykolhatka...@gmail.com> wrote:
>
> > +1 for approach 1. As Thomas mentioned, I think the metrics layer should be abstracted out from its implementation. This way one can plug different metrics systems into Apex.
> >
> > Also keep in mind that there is a lot of code which uses AutoMetrics annotations. We should help a smooth transition from that. As the next release will be a major version release, this is a good opportunity for getting rid of the old AutoMetrics and counters functionality.
> >
> > Regards,
> > Chinmay.
> >
> > On Thu, 10 May 2018, 4:47 am Vlad Rozov, <vro...@apache.org> wrote:
> >
> > > +1 for the #1 proposal.
> > >
> > > Thank you,
> > >
> > > Vlad
> > >
> > > On 5/9/18 07:14, Thomas Weise wrote:
> > > > +1 for the initiative
> > > >
> > > > Some thoughts:
> > > >
> > > > - it is probably good to retain a level of abstraction, to avoid direct dependency on dropwizard
> > > > - support programmatic metric creation (not just annotations)
> > > > - remove deprecated counter and auto-metric code and migrate operators to use new API
> > > > - which metric reporting systems will be supported out of the box
> > > >
> > > > You can also take a look at how this was structured in Flink:
> > > >
> > > > https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/metrics.html
> > > >
> > > > Thanks,
> > > > Thomas
> > > >
> > > > On Tue, May 8, 2018 at 8:56 AM, Deepak Narkhede <mailtodeep...@gmail.com> wrote:
> > > >
> > > >> Hi Community,
> > > >>
> > > >> I want to propose addition of metrics like gauges, meters, counters and histogram for the following components.
> > > >> 1) Addition of metrics for Container Stats.
> > > >> 2) Addition of metrics for Operator Stats.
> > > >> 3) Addition of metrics for Stram Application Master stats.
> > > >> 4) Addition of metrics for JVM related stats for all containers.
> > > >>
> > > >> To implement them would be using metrics dropwizard api's. (http://metrics.dropwizard.io/)
> > > >> Use cases:
> > > >> 1) Can be directly pushed to external visualisation system like Graphite.
> > > >> 2) Can be viewed in visualVM tools through JMX.
> > > >> 3) Can be outputted to console.
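The name "transform" Ananth mentions (dot hierarchical notation to key-value tags) could look roughly like the sketch below. It is only an illustration: the class name `MetricNameTransformSketch`, the method `toTaggedName`, and the assumption that a dotted name alternates key and value segments before the metric name are all hypothetical and not part of Apex, Dropwizard or Prometheus.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class MetricNameTransformSketch {

    /**
     * Converts "key.value.key.value.metric" into metric{key="value",...}.
     * The alternating key/value layout is an assumption of this sketch,
     * not a convention defined anywhere in the thread.
     */
    public static String toTaggedName(String dottedName) {
        String[] parts = dottedName.split("\\.");
        if (parts.length % 2 == 0) {
            throw new IllegalArgumentException(
                "expected key/value pairs followed by a metric name: " + dottedName);
        }
        // Preserve the hierarchy order of the original name in the tag list.
        Map<String, String> tags = new LinkedHashMap<>();
        for (int i = 0; i < parts.length - 1; i += 2) {
            tags.put(parts[i], parts[i + 1]);
        }
        String metric = parts[parts.length - 1];
        if (tags.isEmpty()) {
            return metric;
        }
        StringJoiner joined = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> tag : tags.entrySet()) {
            joined.add(tag.getKey() + "=\"" + tag.getValue() + "\"");
        }
        return metric + joined;
    }

    public static void main(String[] args) {
        // prints: tuplesProcessed{container="c1",operator="parser"}
        System.out.println(toTaggedName("container.c1.operator.parser.tuplesProcessed"));
    }
}
```

A reporter for a tag-based system would apply such a mapping at report time, so the metrics themselves can keep one internal naming scheme regardless of which reporter the end user selects.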
[Feature Proposal] Add metrics dropwizards like gauges, meters, histogram etc to Apex Platform
Hi Community,

I want to propose the addition of metrics such as gauges, meters, counters and histograms for the following components:
1) Container stats.
2) Operator stats.
3) Stram Application Master stats.
4) JVM-related stats for all containers.

The implementation would use the Dropwizard Metrics APIs (http://metrics.dropwizard.io/).

Use cases:
1) Metrics can be pushed directly to an external visualisation system like Graphite.
2) Metrics can be viewed in VisualVM tools through JMX.
3) Metrics can be output to the console.
4) Metrics can also be pushed to a custom sink. We will need to write the sinks and reporters required for custom sinks.

Design/Implementation approach:

Way #1:
1) Create new annotations like @MetricTypeGauge, @MetricTypeMeter, @MetricTypeCounter, @MetricTypeHistogram. They can be applied to both fields and methods.
2) Add them to the relevant methods or fields of classes like StreamingContainer and StreamingAppMasterService to extract the relevant metrics.
3) During node creation (InputNode/GenericNode/OiONode), create and initialise the metrics registry depending on the component.
4) During the collectMetrics() part of the operator runner thread (InputNode.run/GenericNode.run), invoke the annotated methods and collect the different types of metrics.
5) A sink can push the metrics to a reporter like Console, JMX etc.

Way #2:
Use the existing AutoMetrics annotations and convert some metrics to different types like gauge, counter etc. But this cannot be done generically, as we don't know the types. More investigation is still going on for this approach.

I would prefer the first way.

Note: There are some complications if two operators are deployed in the same JVM container. But I think this can be resolved by creating two different metrics registries, each with a unique id from the JVM.

Let me know your thoughts on this.

Thanks,
Deepak
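To make Way #1 more concrete, here is a minimal sketch of annotation-driven collection using only the JDK. The annotation name `@MetricTypeGauge` comes from the proposal, but its `name` attribute, the classes `AnnotatedMetricsSketch` and `DemoContainerStats`, and the `collectMetrics(registryId, component)` signature are assumptions made for illustration. It does not use Dropwizard and is not the Apex implementation; a real version would presumably register the values with a metrics registry instead of a plain map.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class AnnotatedMetricsSketch {

    /** Hypothetical definition; the proposal names the annotation but does not define it. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.FIELD, ElementType.METHOD})
    public @interface MetricTypeGauge {
        String name();
    }

    /** Hypothetical stand-in for a component such as a container or an operator. */
    public static class DemoContainerStats {
        @MetricTypeGauge(name = "tuplesProcessed")
        private long tuplesProcessed = 42L;

        @MetricTypeGauge(name = "heapUsedBytes")
        public long heapUsedBytes() {
            return 1024L;
        }
    }

    /**
     * Reads every annotated field and no-argument method of the component.
     * Keys are prefixed with registryId so that two operators sharing one JVM
     * do not overwrite each other's values.
     */
    public static Map<String, Object> collectMetrics(String registryId, Object component) {
        Map<String, Object> snapshot = new TreeMap<>();
        Class<?> type = component.getClass();
        try {
            for (Field field : type.getDeclaredFields()) {
                MetricTypeGauge gauge = field.getAnnotation(MetricTypeGauge.class);
                if (gauge != null) {
                    field.setAccessible(true);
                    snapshot.put(registryId + "." + gauge.name(), field.get(component));
                }
            }
            for (Method method : type.getDeclaredMethods()) {
                MetricTypeGauge gauge = method.getAnnotation(MetricTypeGauge.class);
                if (gauge != null && method.getParameterCount() == 0) {
                    method.setAccessible(true);
                    snapshot.put(registryId + "." + gauge.name(), method.invoke(component));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("metric collection failed", e);
        }
        return snapshot;
    }

    public static void main(String[] args) {
        // Two components in the same JVM, each reported under its own id.
        System.out.println(collectMetrics("operator-1", new DemoContainerStats()));
        System.out.println(collectMetrics("operator-2", new DemoContainerStats()));
    }
}
```

Prefixing keys with a per-component id is one simple way to address the note about two operators in the same JVM; separate registry objects per operator would achieve the same isolation.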
[jira] [Commented] (APEXCORE-724) Support for Kubernetes
[ https://issues.apache.org/jira/browse/APEXCORE-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016862#comment-16016862 ]

Deepak Narkhede commented on APEXCORE-724:
------------------------------------------

Hi Thomas,

I would like to contribute to this feature. I have also done some investigation earlier using the Kubernetes client API and pods (single or multi-Docker containers) on Kubernetes.

Thanks,
Deepak

> Support for Kubernetes
> ----------------------
>
> Key: APEXCORE-724
> URL: https://issues.apache.org/jira/browse/APEXCORE-724
> Project: Apache Apex Core
> Issue Type: New Feature
> Reporter: Thomas Weise
> Labels: roadmap
>
> It should be possible to run Apex applications on Kubernetes. This will also require that Apex applications can be packaged as containers (Docker or other supported container).

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] apex-malhar pull request #594: APEXMALHAR-2460 Redshift output module unable...
Github user deepak-narkhede closed the pull request at: https://github.com/apache/apex-malhar/pull/594 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] apex-malhar pull request #601: APEXMALHAR-2431 Create Kinesis Input operator...
GitHub user deepak-narkhede reopened a pull request:

    https://github.com/apache/apex-malhar/pull/601

    APEXMALHAR-2431 Create Kinesis Input operator which emits byte array

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2431

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/601.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #601

commit fd4f61f2aa5c872734427b48b04350a31cee222e
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-04-05T08:21:58Z

    APEXMALHAR-2431 Create Kinesis Input operator which emits byte array as a tuple
[GitHub] apex-malhar pull request #601: APEXMALHAR-2431 Create Kinesis Input operator...
Github user deepak-narkhede closed the pull request at:

    https://github.com/apache/apex-malhar/pull/601
[GitHub] apex-malhar pull request #601: APEXMALHAR-2431 Create Kinesis Input operator...
GitHub user deepak-narkhede opened a pull request:

    https://github.com/apache/apex-malhar/pull/601

    APEXMALHAR-2431 Create Kinesis Input operator which emits byte array

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2431

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/601.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #601

commit fd4f61f2aa5c872734427b48b04350a31cee222e
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-04-05T08:21:58Z

    APEXMALHAR-2431 Create Kinesis Input operator which emits byte array as a tuple
[jira] [Updated] (APEXMALHAR-2460) Redshift output module tuples unable to emit tuples
[ https://issues.apache.org/jira/browse/APEXMALHAR-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Narkhede updated APEXMALHAR-2460:
----------------------------------------
Summary: Redshift output module tuples unable to emit tuples (was: Redshift output module tuples were unable to emit)

> Redshift output module tuples unable to emit tuples
> ---------------------------------------------------
>
> Key: APEXMALHAR-2460
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2460
> Project: Apache Apex Malhar
> Issue Type: Bug
> Reporter: Deepak Narkhede
> Assignee: Deepak Narkhede
>
> Issue: The emitting code block was commented out in the Redshift output module, specifically for the S3 compaction operator.
[jira] [Created] (APEXMALHAR-2460) Redshift output module tuples were unable to emit
Deepak Narkhede created APEXMALHAR-2460:
----------------------------------------

Summary: Redshift output module tuples were unable to emit
Key: APEXMALHAR-2460
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2460
Project: Apache Apex Malhar
Issue Type: Bug
Reporter: Deepak Narkhede
Assignee: Deepak Narkhede

Issue: The emitting code block was commented out in the Redshift output module, specifically for the S3 compaction operator.
[jira] [Created] (APEXMALHAR-2431) Create Kinesis Input operator which emits byte array as tuple
Deepak Narkhede created APEXMALHAR-2431:
----------------------------------------

Summary: Create Kinesis Input operator which emits byte array as tuple
Key: APEXMALHAR-2431
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2431
Project: Apache Apex Malhar
Issue Type: Improvement
Reporter: Deepak Narkhede
Assignee: Deepak Narkhede
[GitHub] apex-core pull request #478: APEXCORE-634 Changes to unifier attribute test ...
GitHub user deepak-narkhede opened a pull request:

    https://github.com/apache/apex-core/pull/478

    APEXCORE-634 Changes to unifier attribute test case for module.

    Incorporated comments from Vlad on (https://github.com/apache/apex-core/pull/466)
    [APEXCORE-634] Apex Platform unable to set unifier attributes for modules

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deepak-narkhede/apex-core APEXCORE-634

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-core/pull/478.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #478

commit 976d4452660ae6c0b44d19885f375fc0c7fdd126
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-02-23T08:08:42Z

    APEXCORE-634 Changes to unifier attribute test case for module.
[GitHub] apex-malhar pull request #550: APEXMALHAR-2381 Change WindowManager for perf...
Github user deepak-narkhede closed the pull request at:

    https://github.com/apache/apex-malhar/pull/550
[GitHub] apex-malhar pull request #550: APEXMALHAR-2381 Change WindowManager for perf...
GitHub user deepak-narkhede reopened a pull request:

    https://github.com/apache/apex-malhar/pull/550

    APEXMALHAR-2381 Change WindowManager for performance issues in Kinesi…

    For performance benefits, change the existing windowDataManager. If FSWindowManager is required, it can be set through a property.
    Tested with the Kinesis to S3 app.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2381

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/550.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #550

commit 4b4e8c418b3a0898166e4abd8bc366f53733e9b0
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-02-13T11:08:16Z

    APEXMALHAR-2381 Change WindowManager for performance issues in Kinesis Input Operator

    This change contains:
    1) Change WindowManager for performance issues in Kinesis Input Operator.
    2) Unit test for default WindowDataManager for KinesisInputOperator.
    3) Fix for addition of fasterxml dependency; previously all unit tests were failing.
[GitHub] apex-malhar pull request #556: APEXMALHAR-2412 Provide emitTuple overriding ...
GitHub user deepak-narkhede opened a pull request:

    https://github.com/apache/apex-malhar/pull/556

    APEXMALHAR-2412 Provide emitTuple overriding functionality for user in kinesis Input operator

    Allow users to override emitTuple in the Kinesis Input operator.
    Tested with a custom app. A separate pull request will be opened for the custom application, as it is part of the examples repo.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2412

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/556.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #556

commit e7a24e325534f13e6b3a4131f248ac517a97398b
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-02-17T08:46:40Z

    APEXMALHAR-2412 Provide emitTuple overriding functionality for user in kinesis Input operator
[GitHub] apex-malhar pull request #555: APEXMALHAR-2411 Avoid isreplaystate variable,...
GitHub user deepak-narkhede opened a pull request:

    https://github.com/apache/apex-malhar/pull/555

    APEXMALHAR-2411 Avoid isreplaystate variable, incorporate logic in activate() and replay() for Kinesis Input Operator

    Avoid the isreplaystate variable and incorporate its logic in activate() and replay() for the Kinesis Input Operator.
    The unit test testRecoveryAndIdempotency(), which used to fail, passes with the fix.
    Also tested with the Kinesis to S3 app.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2411

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/555.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #555

commit 0c0a8cf690742fbec3b98d32641bf6bdbaf9a166
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-02-17T06:13:05Z

    APEXMALHAR-2411 Avoid isreplaystate variable, incorporate logic in activate() and replay() for Kinesis Input Operator
[jira] [Updated] (APEXMALHAR-2411) Avoid isreplaystate variable, incorporate logic in activate() and replay() for Kinesis Input Operator
[ https://issues.apache.org/jira/browse/APEXMALHAR-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Narkhede updated APEXMALHAR-2411:
----------------------------------------
Summary: Avoid isreplaystate variable, incorporate logic in activate() and replay() for Kinesis Input Operator (was: Avoid consumer start while replay state in KinesisInputOperator)

> Avoid isreplaystate variable, incorporate logic in activate() and replay() for Kinesis Input Operator
> -----------------------------------------------------------------------------------------------------
>
> Key: APEXMALHAR-2411
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2411
> Project: Apache Apex Malhar
> Issue Type: Improvement
> Reporter: Deepak Narkhede
> Assignee: Deepak Narkhede
[jira] [Created] (APEXMALHAR-2412) Provide emitTuple overriding functionality for user in kinesis Input operator
Deepak Narkhede created APEXMALHAR-2412:
----------------------------------------

Summary: Provide emitTuple overriding functionality for user in kinesis Input operator
Key: APEXMALHAR-2412
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2412
Project: Apache Apex Malhar
Issue Type: Improvement
Reporter: Deepak Narkhede
[jira] [Created] (APEXMALHAR-2411) Avoid consumer start while replay state in KinesisInputOperator
Deepak Narkhede created APEXMALHAR-2411:
----------------------------------------

Summary: Avoid consumer start while replay state in KinesisInputOperator
Key: APEXMALHAR-2411
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2411
Project: Apache Apex Malhar
Issue Type: Improvement
Reporter: Deepak Narkhede
Assignee: Deepak Narkhede
[GitHub] apex-core pull request #466: APEXCORE-634 Apex Platform unable to set unifie...
GitHub user deepak-narkhede reopened a pull request:

    https://github.com/apache/apex-core/pull/466

    APEXCORE-634 Apex Platform unable to set unifier attributes for modules

    Problem: Unable to set unifier attributes of an output port within modules.

    Description: When modules are flattened in the logical plan of the DAG, only the top-level attributes of OperatorMeta are cloned. Unifier attributes are not copied in PortMapping for output ports.

    Solution: Clone the unifier attributes while flattening the DAG in the logical plan.

    Testing was done with a custom application, with and without modules, and also with the app template HDFS to S3 module. Debug logs and stack traces were used to verify the fix.

    For the log snippets below, the unifier attribute TIMEOUT_WINDOW_COUNT was set to 10 (i.e. 1000 millisec).

    **Without fix:**
    2017-02-06 15:51:31,539 WARN com.datatorrent.stram.StreamingContainerManager: UNIFIER operator PTOperator[id=4,name=**genmodule$gen.out#unifier]** committed window , recovery window , current time 1486376491539, last window id change time 0, **window processing timeout millis 6**

    **With Fix:**
    2017-02-06 14:22:49,602 WARN com.datatorrent.stram.StreamingContainerManager: UNIFIER operator PTOperator[id=4,name=**genmodule$gen.out#unifier]** committed window , recovery window , current time 1486371169602, last window id change time 0, **window processing timeout millis 1000**

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deepak-narkhede/apex-core APEXCORE-634

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-core/pull/466.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #466

commit 12ce7c5d8082289c4690d70b09e65033090c44e1
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-02-07T11:09:07Z

    APEXCORE-634 Apex Platform unable to set unifier attributes for modules in DAG.

    Problem: Unable to set Unifier attributes of output port within modules
    Description: When modules are flatten in logical Plan of DAG, only top level attributes are cloned of OperatorMeta. Unifier attributes are not copied in PortMapping for output ports.
    Solution: Clone the unifier attributes while flattening DAG in logical Plan.

commit 5df34a81670520cedc74b1ac3f14d1a2d02ca203
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-02-15T16:09:23Z

    APEXCORE-634 Added unit test for testing unifier attribute for module.
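The problem and solution described in this pull request can be pictured with a toy model. The sketch below is not the Apex code: `ToyOutputPort`, `flattenPort` and the plain string keys are hypothetical stand-ins, and the real change lives in the logical plan's flattening logic. It only shows why a unifier attribute is lost when flattening copies the port attributes alone, and how cloning the unifier attributes preserves it.

```java
import java.util.HashMap;
import java.util.Map;

public class UnifierAttributeCloneSketch {

    /** Toy stand-in for an output port's metadata; not an Apex class. */
    public static final class ToyOutputPort {
        public final Map<String, Object> portAttributes = new HashMap<>();
        public final Map<String, Object> unifierAttributes = new HashMap<>();
    }

    /**
     * Copies a port from inside a module into the flattened plan.
     * With cloneUnifierAttributes == false only the port-level attributes
     * survive, which mirrors the behaviour described as the bug.
     */
    public static ToyOutputPort flattenPort(ToyOutputPort modulePort, boolean cloneUnifierAttributes) {
        ToyOutputPort flattened = new ToyOutputPort();
        flattened.portAttributes.putAll(modulePort.portAttributes);
        if (cloneUnifierAttributes) {
            flattened.unifierAttributes.putAll(modulePort.unifierAttributes);
        }
        return flattened;
    }

    public static void main(String[] args) {
        ToyOutputPort inModule = new ToyOutputPort();
        inModule.unifierAttributes.put("TIMEOUT_WINDOW_COUNT", 10);

        // Without the fix the user's setting is silently dropped.
        System.out.println(flattenPort(inModule, false).unifierAttributes);
        // With the fix the setting reaches the flattened plan.
        System.out.println(flattenPort(inModule, true).unifierAttributes);
    }
}
```

In the toy model a dropped attribute simply falls back to whatever default the consumer applies, which is consistent with the differing timeout values in the two log lines above.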
[GitHub] apex-core pull request #466: APEXCORE-634 Apex Platform unable to set unifie...
Github user deepak-narkhede closed the pull request at:

    https://github.com/apache/apex-core/pull/466
[GitHub] apex-malhar pull request #550: APEXMALHAR-2381 Change WindowManager for perf...
GitHub user deepak-narkhede opened a pull request:

    https://github.com/apache/apex-malhar/pull/550

    APEXMALHAR-2381 Change WindowManager for performance issues in Kinesi…

    For performance benefits, change the existing windowDataManager. If FSWindowManager is required, it can be set through a property.
    Tested with the Kinesis to S3 app.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2381

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/550.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #550

commit c4ea761201eb747e48e21fd3c44b0c7982cc097b
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-02-13T11:08:16Z

    APEXMALHAR-2381 Change WindowManager for performance issues in Kinesis Input Operator
[GitHub] apex-malhar pull request #549: APEXMALHAR-2380 Add MutablePair for Kinesis ...
GitHub user deepak-narkhede opened a pull request:

https://github.com/apache/apex-malhar/pull/549

APEXMALHAR-2380 Add MutablePair for Kinesis Operator for Recovery State

Add MutablePair to the Kinesis operator for the recovery state, which avoids the default serialization caused by KinesisPair.

Tested with the Kinesis to S3 app.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2380

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/549.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #549

commit c7185546539a800a3a277e86ad3d400bd58ac8d8
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-01-05T09:37:56Z

APEXMALHAR-2380 Add MutablePair for Kinesis Operator for Recovery State
[jira] [Commented] (APEXCORE-634) Apex Platform unable to set unifier attributes for modules in DAG
[ https://issues.apache.org/jira/browse/APEXCORE-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855811#comment-15855811 ]

Deepak Narkhede commented on APEXCORE-634:
------------------------------------------

Hi Ram,

After further investigation, the problem turned out to be with the unifier attributes of operators within modules when the DAG is flattened.

Problem: Unable to set unifier attributes of an output port within modules.
Description: When modules are flattened in the logical plan of the DAG, only the top-level attributes of OperatorMeta are cloned. Unifier attributes are not copied in PortMapping for output ports.
Solution: Clone the unifier attributes while flattening the DAG in the logical plan.

Thanks,
Deepak

> Apex Platform unable to set unifier attributes for modules in DAG
> -----------------------------------------------------------------
>
> Key: APEXCORE-634
> URL: https://issues.apache.org/jira/browse/APEXCORE-634
> Project: Apache Apex Core
> Issue Type: Bug
> Reporter: Deepak Narkhede
> Assignee: Deepak Narkhede
>

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (APEXCORE-634) Apex Platform unable to set unifier attributes for modules in DAG
[ https://issues.apache.org/jira/browse/APEXCORE-634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede updated APEXCORE-634: - Summary: Apex Platform unable to set unifier attributes for modules in DAG (was: Apex Platform unable to set Unifier attribute window time out ) > Apex Platform unable to set unifier attributes for modules in DAG > - > > Key: APEXCORE-634 > URL: https://issues.apache.org/jira/browse/APEXCORE-634 > Project: Apache Apex Core > Issue Type: Bug > Reporter: Deepak Narkhede > Assignee: Deepak Narkhede > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] apex-core pull request #466: APEXCORE-634 Apex Platform unable to set unifie...
GitHub user deepak-narkhede opened a pull request:

https://github.com/apache/apex-core/pull/466

APEXCORE-634 Apex Platform unable to set unifier attributes for modules

Problem: Unable to set unifier attributes of an output port within modules.
Description: When modules are flattened in the logical plan of the DAG, only the top-level attributes of OperatorMeta are cloned. Unifier attributes are not copied in PortMapping for output ports.
Solution: Clone the unifier attributes while flattening the DAG in the logical plan.

Testing was done with a custom application, with and without modules, and also with the HDFS to S3 module app template. Debug logs and stack traces were used to verify the fix. For the log snippets below, the unifier attribute TIMEOUT_WINDOW_COUNT was set to 10 (i.e. 1000 millisec).

**Without fix:**
2017-02-06 15:51:31,539 WARN com.datatorrent.stram.StreamingContainerManager: UNIFIER operator PTOperator[id=4,name=**genmodule$gen.out#unifier]** committed window , recovery window , current time 1486376491539, last window id change time 0, **window processing timeout millis 6**

**With fix:**
2017-02-06 14:22:49,602 WARN com.datatorrent.stram.StreamingContainerManager: UNIFIER operator PTOperator[id=4,name=**genmodule$gen.out#unifier]** committed window , recovery window , current time 1486371169602, last window id change time 0, **window processing timeout millis 1000**

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-core APEXCORE-634

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/466.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #466

commit 3ab8916058e3d5ccaecb9dc84c9a96d6ab22ebec
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2017-02-07T11:09:07Z

APEXCORE-634 Apex Platform unable to set unifier attributes for modules in DAG.

Problem: Unable to set unifier attributes of an output port within modules.
Description: When modules are flattened in the logical plan of the DAG, only the top-level attributes of OperatorMeta are cloned. Unifier attributes are not copied in PortMapping for output ports.
Solution: Clone the unifier attributes while flattening the DAG in the logical plan.
[jira] [Created] (APEXCORE-634) Apex Platform unable to set Unifier attribute window time out
Deepak Narkhede created APEXCORE-634: Summary: Apex Platform unable to set Unifier attribute window time out Key: APEXCORE-634 URL: https://issues.apache.org/jira/browse/APEXCORE-634 Project: Apache Apex Core Issue Type: Bug Reporter: Deepak Narkhede Assignee: Deepak Narkhede -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (APEXCORE-621) populate TIMEOUT_WINDOW_COUNT for thread local operators from downstreams.
[ https://issues.apache.org/jira/browse/APEXCORE-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Narkhede reassigned APEXCORE-621:
---------------------------------------

Assignee: Deepak Narkhede

> populate TIMEOUT_WINDOW_COUNT for thread local operators from downstreams.
> --------------------------------------------------------------------------
>
> Key: APEXCORE-621
> URL: https://issues.apache.org/jira/browse/APEXCORE-621
> Project: Apache Apex Core
> Issue Type: Improvement
> Reporter: Tushar Gosavi
> Assignee: Deepak Narkhede
>
> A -> B -> C -> D
> In the above DAG, if TIMEOUT_WINDOW_COUNT is set on 'C', and 'B' and 'C' are thread local, then 'B' uses the default TIMEOUT_WINDOW_COUNT attribute and is marked as a blocked operator while 'C' is performing a time-consuming operation.
> The problem is more visible when operator B is partitioned and unifiers are deployed thread local to C; in this case the unifiers are declared blocked, and users need to remember to set TIMEOUT_WINDOW_COUNT on the unifiers.
> Instead, the platform could inherit the TIMEOUT_WINDOW_COUNT attribute from the downstream operator in the thread-local/container-local case, to avoid being detected as blocked early.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
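The inheritance proposed in APEXCORE-621 can be illustrated with a small standalone model. This is only a sketch and not Apex platform code: the class `TimeoutInheritance`, the method `effectiveTimeouts`, and the array-based description of the DAG are hypothetical, and the default window count is passed in as a parameter rather than taken from the platform. Operators are listed in stream order, and an operator without an explicit value inherits from its downstream neighbour only when the connecting stream is thread local.

```java
import java.util.Arrays;

// Hypothetical model of a linear DAG (A -> B -> C -> D); not the Apex API.
class TimeoutInheritance {

  /**
   * explicit[i]          explicit TIMEOUT_WINDOW_COUNT of operator i, or null if unset
   * threadLocalToNext[i] true if the stream from operator i to operator i + 1 is thread local
   * defaultCount         value used when nothing is set and nothing can be inherited
   */
  static int[] effectiveTimeouts(Integer[] explicit, boolean[] threadLocalToNext, int defaultCount) {
    int n = explicit.length;
    int[] effective = new int[n];
    // Walk from the last operator upstream so the downstream value is already known.
    for (int i = n - 1; i >= 0; i--) {
      if (explicit[i] != null) {
        effective[i] = explicit[i];
      } else if (i < n - 1 && threadLocalToNext[i]) {
        effective[i] = effective[i + 1]; // inherit from the thread-local downstream operator
      } else {
        effective[i] = defaultCount;
      }
    }
    return effective;
  }

  public static void main(String[] args) {
    // Only C has an explicit value; B -> C is the only thread-local stream.
    Integer[] explicit = {null, null, 600, null};
    boolean[] threadLocalToNext = {false, true, false};
    System.out.println(Arrays.toString(effectiveTimeouts(explicit, threadLocalToNext, 120)));
  }
}
```

In this model B picks up the value set on C, while A and D keep the default, which matches the behaviour the issue asks for in the thread-local case.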
[jira] [Created] (APEXMALHAR-2380) Add MutablePair for Kinesis Operator for Recovery State
Deepak Narkhede created APEXMALHAR-2380: --- Summary: Add MutablePair for Kinesis Operator for Recovery State Key: APEXMALHAR-2380 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2380 Project: Apache Apex Malhar Issue Type: Improvement Reporter: Deepak Narkhede Assignee: Deepak Narkhede -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2375) Latest Apex Malhar build failure with Apex core version 3.5.0
[ https://issues.apache.org/jira/browse/APEXMALHAR-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769155#comment-15769155 ]

Deepak Narkhede commented on APEXMALHAR-2375:
---------------------------------------------

Hi Thomas,

I had some code in Apex Core that needed to be tested against the latest Apex Malhar, which is how I found the issue.
Anyway, thanks for the clarification, Thomas!

- Deepak

> Latest Apex Malhar build failure with Apex core version 3.5.0
> -------------------------------------------------------------
>
> Key: APEXMALHAR-2375
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2375
> Project: Apache Apex Malhar
> Issue Type: Bug
> Reporter: Deepak Narkhede
> Assignee: Deepak Narkhede
>
> The latest Apex Malhar build fails with Apex core version 3.5.0. Reason: the OperatorContext interface has been updated with the method getName(), and the build fails because some operators in Malhar implement this interface.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Is there any plan to upgrade Apex core version for Apex Malhar ?
Dear Community,

I'm asking because the latest branch of Apex Malhar fails to build against the new Apex core version.

Reason: The OperatorContext interface (from datatorrent.api) has been updated with the method getName(), and because some operators in Malhar implement this interface, the build fails. They are mostly used in test scenarios.

Let me know if there is a plan to upgrade the Apex core version; I would like to take it up. I have also opened a JIRA for this: APEXMALHAR-2375

--
Thanks & Regards
Deepak Narkhede
[jira] [Created] (APEXMALHAR-2375) Latest Apex Malhar build failure with Apex core version 3.5.0
Deepak Narkhede created APEXMALHAR-2375:
-------------------------------------------

Summary: Latest Apex Malhar build failure with Apex core version 3.5.0
Key: APEXMALHAR-2375
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2375
Project: Apache Apex Malhar
Issue Type: Bug
Reporter: Deepak Narkhede
Assignee: Deepak Narkhede

The latest Apex Malhar build fails with Apex core version 3.5.0. Reason: the OperatorContext interface has been updated with the method getName(), and the build fails because some operators in Malhar implement this interface.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2348) Mock tests for PostgreSQL for JdbcPollInputOperator
Deepak Narkhede created APEXMALHAR-2348:
-------------------------------------------

Summary: Mock tests for PostgreSQL for JdbcPollInputOperator
Key: APEXMALHAR-2348
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2348
Project: Apache Apex Malhar
Issue Type: Bug
Reporter: Deepak Narkhede
Assignee: Deepak Narkhede

Because embedded drivers are not available, the PostgreSQL database connections need to be mocked.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2349) Mock tests for PostgreSQL for JdbcPollInputOperator
Deepak Narkhede created APEXMALHAR-2349:
-------------------------------------------

Summary: Mock tests for PostgreSQL for JdbcPollInputOperator
Key: APEXMALHAR-2349
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2349
Project: Apache Apex Malhar
Issue Type: Bug
Reporter: Deepak Narkhede
Assignee: Deepak Narkhede

Because embedded drivers are not available, the PostgreSQL database connections need to be mocked.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #489: APEXMALHAR-2330 JdbcPOJOPollInputOperator fai...
Github user deepak-narkhede closed the pull request at: https://github.com/apache/apex-malhar/pull/489
[GitHub] apex-malhar pull request #489: APEXMALHAR-2330 JdbcPOJOPollInputOperator fai...
GitHub user deepak-narkhede reopened a pull request:

https://github.com/apache/apex-malhar/pull/489

APEXMALHAR-2330 JdbcPOJOPollInputOperator fails with NullPointerExcep...

**Problem:**
JdbcPOJOPollInputOperator fails with a NullPointerException when using the PostgreSQL driver.

**Problem Description:**
1) When JdbcPOJOPollInputOperator calls populateColumnDataTypes, the column names retrieved from the result set metadata of the database (in this case, Postgres) are all lowercase.
2) The columns specified in fieldInfos, however, might not be in the same case.
3) Internally a HashMap (nameToType) is used, and its lookups miss if the column name and the fieldInfo are not in the same case. As a result columnDataTypes is empty, which causes a NullPointerException in the activate call.

**Solution:**
Use the same case for the HashMap keys and the column names, irrespective of the database used with JdbcPOJOPollInputOperator.

**Testing:**
Tested with the Database-to-hdfs app, and also with instrumented logs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2330

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/489.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #489

commit 58812c41bcff437db97853e5c5b628a9f9f5dbc4
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2016-11-09T13:37:10Z

APEXMALHAR-2330 JdbcPOJOPollInputOperator fails with NullPointerException when PostgreSQL driver.

Use the same case for the HashMap keys and the column names, irrespective of the database used with JdbcPOJOPollInputOperator.
[GitHub] apex-malhar pull request #489: APEXMALHAR-2330 JdbcPOJOPollInputOperator fai...
GitHub user deepak-narkhede opened a pull request:

https://github.com/apache/apex-malhar/pull/489

APEXMALHAR-2330 JdbcPOJOPollInputOperator fails with NullPointerExcep...

**Problem:**
JdbcPOJOPollInputOperator fails with a NullPointerException when using the PostgreSQL driver.

**Problem Description:**
1) When JdbcPOJOPollInputOperator calls populateColumnDataTypes, the column names retrieved from the result set metadata of the database (in this case, Postgres) are all lowercase.
2) The columns specified in fieldInfos, however, might not be in the same case.
3) Internally a HashMap (nameToType) is used, and its lookups miss if the column name and the fieldInfo are not in the same case. As a result columnDataTypes is empty, which causes a NullPointerException in the activate call.

**Solution:**
Use the same case for the HashMap keys and the column names, irrespective of the database used with JdbcPOJOPollInputOperator.

**Testing:**
Tested with the Database-to-hdfs app, and also with instrumented logs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2330

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/489.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #489

commit 58812c41bcff437db97853e5c5b628a9f9f5dbc4
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2016-11-09T13:37:10Z

APEXMALHAR-2330 JdbcPOJOPollInputOperator fails with NullPointerException when PostgreSQL driver.

Use the same case for the HashMap keys and the column names, irrespective of the database used with JdbcPOJOPollInputOperator.
[jira] [Created] (APEXMALHAR-2330) JdbcPOJOPollInputOperator fails with NullPointerException when PostgreSQL driver
Deepak Narkhede created APEXMALHAR-2330:
-------------------------------------------

Summary: JdbcPOJOPollInputOperator fails with NullPointerException when PostgreSQL driver
Key: APEXMALHAR-2330
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2330
Project: Apache Apex Malhar
Issue Type: Bug
Reporter: Deepak Narkhede
Assignee: Deepak Narkhede

Here is the description:

Problem:
JdbcPOJOPollInputOperator fails with a NullPointerException when using the PostgreSQL driver.

Problem Description:
1) When JdbcPOJOPollInputOperator calls populateColumnDataTypes, the column names retrieved from the result set metadata of the database (in this case, Postgres) are all lowercase.
2) The columns specified in fieldInfos, however, might not be in the same case.
3) Internally a HashMap (nameToType) is used, and its lookups miss if the column name and the fieldInfo are not in the same case. As a result columnDataTypes is empty, which causes a NullPointerException in the activate call.

Proposed Solution:
Use the same case for the HashMap keys and the column names, irrespective of the database used with JdbcPOJOPollInputOperator.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
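The case mismatch described in APEXMALHAR-2330 can be reproduced outside Malhar with a plain HashMap. The sketch below is an illustration under assumptions, not the operator's code: the class `ColumnCaseMatch` and its methods are hypothetical, and only the map name `nameToType` is borrowed from the issue. It normalizes both the keys stored from the database metadata and the configured column name to lowercase before the lookup.

```java
import java.sql.Types;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; it only demonstrates the same-case lookup idea.
class ColumnCaseMatch {

  // Locale.ROOT keeps the conversion independent of the JVM's default locale.
  static String normalize(String name) {
    return name.toLowerCase(Locale.ROOT);
  }

  // Builds the name-to-type map with normalized keys.
  static Map<String, Integer> buildNameToType(List<String> columnNames, List<Integer> sqlTypes) {
    Map<String, Integer> nameToType = new HashMap<>();
    for (int i = 0; i < columnNames.size(); i++) {
      nameToType.put(normalize(columnNames.get(i)), sqlTypes.get(i));
    }
    return nameToType;
  }

  // Looks up a configured column name using the same normalization as the keys.
  static Integer typeOf(Map<String, Integer> nameToType, String configuredName) {
    return nameToType.get(normalize(configuredName));
  }

  public static void main(String[] args) {
    // Column names as a database might report them: all lowercase.
    Map<String, Integer> nameToType = buildNameToType(
        Arrays.asList("account_no", "name"),
        Arrays.asList(Types.INTEGER, Types.VARCHAR));

    // An exact-case lookup with the configured name misses.
    System.out.println("raw lookup:        " + nameToType.get("ACCOUNT_NO"));
    // The normalized lookup finds the entry.
    System.out.println("normalized lookup: " + typeOf(nameToType, "ACCOUNT_NO"));
  }
}
```

Normalizing on both the write and the read side means the lookup no longer depends on how a particular driver reports column names.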
[GitHub] apex-malhar pull request #468: APEXMALHAR-2314 Improper functioning in parti...
GitHub user deepak-narkhede reopened a pull request:

https://github.com/apache/apex-malhar/pull/468

APEXMALHAR-2314 Improper functioning in partitioning for sequentialFileRead for FSRecord

Fix the StreamCodec for FSRecordReader. Previously it used the hash code of the blockId, which is almost always unique, so it could not satisfy the sequentialFileRead property. The StreamCodec now works with the hash code of the filePath, so all blocks belonging to a file are partitioned to the same operator.

Tested with recordReader and verified for sequentialFileRead that all blocks belonging to a file are partitioned to a single operator.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2314

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/468.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #468

commit 3f973043f5d343bcf7cb067269377e4e08c76aff
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2016-11-07T09:44:47Z

APEXMALHAR-2314 Improper functioning in partitioning of sequentialFileRead property of FSRecordReaderModule.

Modified the StreamCodec to work with hashcode of filepath rather than blockId.

Conflicts:
library/src/main/java/org/apache/apex/malhar/lib/fs/FSRecordReaderModule.java
[GitHub] apex-malhar pull request #468: APEXMALHAR-2314 Improper functioning in parti...
Github user deepak-narkhede closed the pull request at: https://github.com/apache/apex-malhar/pull/468
[jira] [Created] (APEXMALHAR-2319) Documentation for FSRecordReader Module for line by line reader.
Deepak Narkhede created APEXMALHAR-2319: --- Summary: Documentation for FSRecordReader Module for line by line reader. Key: APEXMALHAR-2319 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2319 Project: Apache Apex Malhar Issue Type: Documentation Reporter: Deepak Narkhede Assignee: Deepak Narkhede -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (APEXCORE-567) Documentation for FSRecordReader Module for line by line reader.
[ https://issues.apache.org/jira/browse/APEXCORE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede closed APEXCORE-567. Resolution: Not A Bug > Documentation for FSRecordReader Module for line by line reader. > > > Key: APEXCORE-567 > URL: https://issues.apache.org/jira/browse/APEXCORE-567 > Project: Apache Apex Core > Issue Type: Documentation > Reporter: Deepak Narkhede > Assignee: Deepak Narkhede > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXCORE-567) Documentation for FSRecordReader Module for line by line reader.
Deepak Narkhede created APEXCORE-567: Summary: Documentation for FSRecordReader Module for line by line reader. Key: APEXCORE-567 URL: https://issues.apache.org/jira/browse/APEXCORE-567 Project: Apache Apex Core Issue Type: Documentation Reporter: Deepak Narkhede Assignee: Deepak Narkhede -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2312) NullPointerException in FileSplitterInput only if the file path is specified for attribute instead of directory path
[ https://issues.apache.org/jira/browse/APEXMALHAR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15611295#comment-15611295 ]

Deepak Narkhede commented on APEXMALHAR-2312:
---------------------------------------------

Changed to Critical because some basic functionality is broken.

> NullPointerException in FileSplitterInput only if the file path is specified for attribute instead of directory path
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: APEXMALHAR-2312
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2312
> Project: Apache Apex Malhar
> Issue Type: Bug
> Reporter: Deepak Narkhede
> Assignee: Deepak Narkhede
> Priority: Critical
>
> Problem Statement:
> ==================
> A NullPointerException is seen in FileSplitterInput only if a file path, rather than a directory path, is specified for the attribute.
> Description:
> ============
> 1) The TimeBasedDirectoryScanner threads, part of the scan service, scan the directories/files.
> 2) Each thread uses the isIterationCompleted() method [referenceTimes] to check whether the files scanned in the last iteration have been processed by the operator thread.
> 3) Previously this worked because the HashMap (referenceTimes) returned null even if the last scanned directory path was null.
> 4) Recently referenceTimes was changed to a ConcurrentHashMap, whose get() method does not allow null keys.
> 5) Hence a NullPointerException is seen: if only a file path is provided, the directory path is empty (null), and so is the key.
> Solution:
> =========
> Pre-check whether the directory path is null; if it is, because only a file path was provided, the last iteration is treated as completed.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
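The difference between the two map types that the description relies on is standard JDK behaviour and can be checked in isolation. The sketch below is not the FileSplitterInput code: the class `NullKeyLookup` and its methods are hypothetical, and only the map name `referenceTimes` is taken from the issue. It shows that HashMap tolerates a null key on get(), that ConcurrentHashMap throws a NullPointerException, and how a null pre-check avoids the failure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; it does not mirror the scanner's real method signatures.
class NullKeyLookup {

  /** Null-safe lookup: a null directory is treated as "nothing recorded". */
  static Long referenceTimeFor(Map<String, Long> referenceTimes, String directory) {
    if (directory == null) {
      return null; // guard before touching the map, since ConcurrentHashMap rejects null keys
    }
    return referenceTimes.get(directory);
  }

  /** Reports whether map.get(null) throws a NullPointerException. */
  static boolean getNullThrows(Map<String, Long> map) {
    try {
      map.get(null);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Map<String, Long> plain = new HashMap<>();
    Map<String, Long> concurrent = new ConcurrentHashMap<>();

    System.out.println("HashMap.get(null) throws:           " + getNullThrows(plain));
    System.out.println("ConcurrentHashMap.get(null) throws: " + getNullThrows(concurrent));
    System.out.println("guarded lookup with null directory: " + referenceTimeFor(concurrent, null));
  }
}
```

The guard keeps the map type unchanged and moves the null handling to the caller, which is the shape of the pre-check the issue proposes.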
[jira] [Updated] (APEXMALHAR-2312) NullPointerException in FileSplitterInput only if the file path is specified for attribute instead of directory path
[ https://issues.apache.org/jira/browse/APEXMALHAR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Narkhede updated APEXMALHAR-2312:
----------------------------------------

Priority: Critical (was: Minor)

> NullPointerException in FileSplitterInput only if the file path is specified for attribute instead of directory path
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: APEXMALHAR-2312
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2312
> Project: Apache Apex Malhar
> Issue Type: Bug
> Reporter: Deepak Narkhede
> Assignee: Deepak Narkhede
> Priority: Critical
>
> Problem Statement:
> ==================
> A NullPointerException is seen in FileSplitterInput only if a file path, rather than a directory path, is specified for the attribute.
> Description:
> ============
> 1) The TimeBasedDirectoryScanner threads, part of the scan service, scan the directories/files.
> 2) Each thread uses the isIterationCompleted() method [referenceTimes] to check whether the files scanned in the last iteration have been processed by the operator thread.
> 3) Previously this worked because the HashMap (referenceTimes) returned null even if the last scanned directory path was null.
> 4) Recently referenceTimes was changed to a ConcurrentHashMap, whose get() method does not allow null keys.
> 5) Hence a NullPointerException is seen: if only a file path is provided, the directory path is empty (null), and so is the key.
> Solution:
> =========
> Pre-check whether the directory path is null; if it is, because only a file path was provided, the last iteration is treated as completed.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2318) Directory scanner threads unable to throw exceptions to parent in FileSplitter
Deepak Narkhede created APEXMALHAR-2318: --- Summary: Directory scanner threads unable to throw exceptions to parent in FileSplitter Key: APEXMALHAR-2318 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2318 Project: Apache Apex Malhar Issue Type: Bug Reporter: Deepak Narkhede Assignee: Deepak Narkhede -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #468: APEXMALHAR-2314 Improper functioning in parti...
GitHub user deepak-narkhede reopened a pull request:

https://github.com/apache/apex-malhar/pull/468

APEXMALHAR-2314 Improper functioning in partitioning for sequentialFileRead for FSRecord

Fix the StreamCodec for FSRecordReader. Previously it used the hash code of the blockId, which is almost always unique, so it could not satisfy the sequentialFileRead property. The StreamCodec now works with the hash code of the filePath, so all blocks belonging to a file are partitioned to the same operator.

Tested with recordReader and verified for sequentialFileRead that all blocks belonging to a file are partitioned to a single operator.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2314

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/468.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #468

commit 259cc5b80635207e8b0a4d7c0c9b5bc735021de2
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2016-10-24T11:39:24Z

APEXMALHAR-2314 Improper functioning in partitioning of sequentialFileRead property of FSRecordReaderModule.

Modified the StreamCodec to work with hashcode of filepath rather than blockId.
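The effect of switching the partitioning key from blockId to filePath can be shown with a simplified model. This is a sketch under assumptions and not the Malhar StreamCodec: the `Block` class, the method names, and the use of a plain modulo over a fixed partition count are all hypothetical simplifications of how a partition is chosen from a hash code.

```java
// Hypothetical model of block metadata and partition selection; not the Apex StreamCodec API.
class BlockPartitioning {

  static final class Block {
    final long blockId;
    final String filePath;

    Block(long blockId, String filePath) {
      this.blockId = blockId;
      this.filePath = filePath;
    }
  }

  // Old behaviour in this model: key on blockId, which differs for every block.
  static int byBlockId(Block block, int partitions) {
    return Math.floorMod(Long.hashCode(block.blockId), partitions);
  }

  // New behaviour in this model: key on filePath, which is shared by all blocks of a file.
  static int byFilePath(Block block, int partitions) {
    return Math.floorMod(block.filePath.hashCode(), partitions);
  }

  public static void main(String[] args) {
    Block[] blocks = {
      new Block(1L, "/data/input/a.txt"),
      new Block(2L, "/data/input/a.txt"),
      new Block(3L, "/data/input/a.txt"),
    };
    for (Block b : blocks) {
      System.out.println("block " + b.blockId
          + " -> byBlockId=" + byBlockId(b, 4)
          + ", byFilePath=" + byFilePath(b, 4));
    }
  }
}
```

With the filePath key every block of a.txt lands in the same partition, so a single reader can process the file's blocks in sequence; with the blockId key the three blocks spread over different partitions.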
[GitHub] apex-malhar pull request #468: APEXMALHAR-2314 Improper functioning in parti...
Github user deepak-narkhede closed the pull request at: https://github.com/apache/apex-malhar/pull/468
[GitHub] apex-malhar pull request #457: APEXMALHAR-2302 Exposing few properties of FS...
GitHub user deepak-narkhede reopened a pull request:

https://github.com/apache/apex-malhar/pull/457

APEXMALHAR-2302 Exposing few properties of FSSplitter and BlockReader operators to FSRecordReaderModule

This change includes:
1) Expose the blockSize property of the FileSplitter operator.
2) Expose minReaders and maxReaders for dynamic partitioning of the Block Reader operator.
3) Deprecate readersCount in FSRecordReaderModule.

Tested with the recordReader application.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2302

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/457.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #457

commit 03e59343fa69b71519f7933b70009994420c6ac0
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2016-10-21T12:27:01Z

APEXMALHAR-2302 Exposing the few properties of FSSplitter and BlockReader operators to FSRecordReaderModule to tune Application.

1) Expose the blockSize property of the FileSplitter operator.
2) Expose minReaders and maxReaders for dynamic partitioning of the Block Reader operator.
3) Deprecate readersCount in FSRecordReaderModule.
[jira] [Updated] (APEXMALHAR-2314) Improper functioning in partitioning of sequentialFileRead property of FSRecordReader
[ https://issues.apache.org/jira/browse/APEXMALHAR-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede updated APEXMALHAR-2314: Description: Fix the StreamCodec for FSRecordReader, initially it was hashcode of blockId's mostly always unique. Hence unable to satisfy the sequentialFileRead property. Now the StreamCodec is modified to work with hashcode of filePath. So all blocks related to a file would be partitioned on same operator. was: Fix the StreamCodec for FSRecordReader, initially it was hashcode of blockId's mostly always unique. Hence unable to satisfy the sequencialFileRead property. Now the StreamCodec is modified to work with hashcode of filePath. So all blocks related to a file would be partitioned on same operator. > Improper functioning in partitioning of sequentialFileRead property of > FSRecordReader > -- > > Key: APEXMALHAR-2314 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2314 > Project: Apache Apex Malhar > Issue Type: Bug > Reporter: Deepak Narkhede > Assignee: Deepak Narkhede >Priority: Minor > > Fix the StreamCodec for FSRecordReader, initially it was hashcode of > blockId's mostly always unique. > Hence unable to satisfy the sequentialFileRead property. Now the StreamCodec > is modified to work > with hashcode of filePath. So all blocks related to a file would be > partitioned on same operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2314) Improper functioning in partitioning of sequentialFileRead property of FSRecordReader
[ https://issues.apache.org/jira/browse/APEXMALHAR-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede updated APEXMALHAR-2314: Summary: Improper functioning in partitioning of sequentialFileRead property of FSRecordReader (was: Improper functioning in partitioning of sequencialFileRead property of FSRecordReader ) > Improper functioning in partitioning of sequentialFileRead property of > FSRecordReader > -- > > Key: APEXMALHAR-2314 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2314 > Project: Apache Apex Malhar > Issue Type: Bug > Reporter: Deepak Narkhede > Assignee: Deepak Narkhede >Priority: Minor > > Fix the StreamCodec for FSRecordReader, initially it was hashcode of > blockId's mostly always unique. > Hence unable to satisfy the sequencialFileRead property. Now the StreamCodec > is modified to work > with hashcode of filePath. So all blocks related to a file would be > partitioned on same operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2314) Improper functioning in partitioning of sequencialFileRead property of FSRecordReader
[ https://issues.apache.org/jira/browse/APEXMALHAR-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede updated APEXMALHAR-2314: Description: Fix the StreamCodec for FSRecordReader, initially it was hashcode of blockId's mostly always unique. Hence unable to satisfy the sequencialFileRead property. Now the StreamCodec is modified to work with hashcode of filePath. So all blocks related to a file would be partitioned on same operator. > Improper functioning in partitioning of sequencialFileRead property of > FSRecordReader > -- > > Key: APEXMALHAR-2314 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2314 > Project: Apache Apex Malhar > Issue Type: Bug > Reporter: Deepak Narkhede > Assignee: Deepak Narkhede >Priority: Minor > > Fix the StreamCodec for FSRecordReader, initially it was hashcode of > blockId's mostly always unique. > Hence unable to satisfy the sequencialFileRead property. Now the StreamCodec > is modified to work > with hashcode of filePath. So all blocks related to a file would be > partitioned on same operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #468: APEXMALHAR-2314 Improper functioning in parti...
GitHub user deepak-narkhede opened a pull request: https://github.com/apache/apex-malhar/pull/468 APEXMALHAR-2314 Improper functioning in partitioning for sequencialFileRead for FSRecord Fix the StreamCodec for FSRecordReader, initially it was hashcode of blockId's mostly always unique. Hence unable to satisfy the sequencialFileRead property. Now the StreamCodec is modified to work with hashcode of filePath. So all blocks related to a file would be partitioned on same operator. Tested with recordReader and verified for sequencialFileRead that all blocks related to a file are partitioned to single operator. You can merge this pull request into a Git repository by running: $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2314 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/468.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #468 commit fe81d79d2b5ec845b97a93a5cd21514eff5f29fa Author: deepak-narkhede <mailtodeep...@gmail.com> Date: 2016-10-24T08:23:57Z APEXMALHAR-2314 Improper functioning in partitioning of sequencialFileRead property of FSRecordReaderModule. Fix the StreamCodec for FSRecordReader, initially it was hashcode of blockId's mostly always unique. Hence unable to satisfy the sequencialFileRead property. Now the StreamCodec is modified to work with hashcode of filePath. So all blocks related to a file would be partitioned on same operator. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] apex-malhar pull request #457: APEXMALHAR-2302 Exposing few properties of FS...
GitHub user deepak-narkhede reopened a pull request: https://github.com/apache/apex-malhar/pull/457 APEXMALHAR-2302 Exposing few properties of FSSplitter and BlockReader operators to FSRecordReaderModule This change adds blockSize property from FileSplitter to FSRecordReaderModule. Tested with RecordReader Application. You can merge this pull request into a Git repository by running: $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2302 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/457.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #457 commit 7d07dd32c95546a6c4570453163f9ff47b8a7893 Author: deepak-narkhede <mailtodeep...@gmail.com> Date: 2016-10-21T12:27:01Z APEXMALHAR-2302 Exposing the few properties of FSSplitter and BlockReader operators to FSRecordReaderModule to tune Application. This change includes: 1) Expose blockSize property of FileSplitter operator. 2) Expose minReaders and maxReaders for dynamic partitioning of Block Reader operator. 3) Deprecate readersCount from FSRecordReaderModule. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] apex-malhar pull request #463: APEXMALHAR-2312 Fix NullPointerException for ...
GitHub user deepak-narkhede reopened a pull request:

https://github.com/apache/apex-malhar/pull/463

APEXMALHAR-2312 Fix NullPointerException for FileSplitterInput Operat…

Problem Statement:
A NullPointerException is seen in FileSplitterInput only if a file path, instead of a directory path, is specified for the attribute.

Description:
1) The TimeBasedDirectoryScanner threads, part of the scan service, scan the directories/files.
2) Each thread uses the isIterationCompleted() [referenceTimes] method to check whether the files scanned in the last iteration have been processed by the operator thread.
3) Previously this worked because HashMap (referenceTimes) returns null from get() even when the last scanned directory path is null.
4) Recently referenceTimes was changed to a ConcurrentHashMap, whose get() method does not allow null keys.
5) Hence a NullPointerException is seen: if only a file path is provided, the directory path is null, and so the key is null.

Solution:
Pre-check whether the directory path is null; if it is, only a file path was provided and the last iteration is treated as completed.

Testing logs with fix for files/directories/sub-directories:

2016-10-21 11:20:38,382 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Directory path: /user/deepak/files Sub-Directory or File path: /user/deepak/files/CustomerTxnData2
2016-10-21 11:20:38,382 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Scan started for input /user/deepak/files
2016-10-21 11:20:38,386 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan /user/deepak/files
2016-10-21 11:20:33,372 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData 1477028632605
2016-10-21 11:20:33,372 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData1 1477028642067
2016-10-21 11:20:33,373 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData2 1477028645290
2016-10-21 11:20:33,373 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan complete 0 3
2016-10-21 11:25:50,697 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Directory path: null Sub-Directory or File path: /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,697 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Scan started for input /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,702 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,704 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan complete

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2312

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/463.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #463

commit 47f29f39393a4e43c8423153d32d12c9622872b5
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2016-10-21T06:44:34Z

APEXMALHAR-2312 Fix NullPointerException for FileSplitterInput Operator if filepath is specified.

Problem Description:
1) The TimeBasedDirectoryScanner threads, part of the scan service, scan the directories/files.
2) Each thread uses the isIterationCompleted() [referenceTimes] method to check whether the files scanned in the last iteration have been processed by the operator thread.
3) Previously this worked because HashMap (referenceTimes) returns null from get() even when the last scanned directory path is null.
4) Recently referenceTimes was changed to a ConcurrentHashMap, whose get() method does not allow null keys.
5) Hence a NullPointerException is seen: if only a file path is provided, the directory path is null, and so the key is null.

Solution:
Pre-check whether the directory path is null; if it is, only a file path was provided and the last iteration is treated as completed.
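The map behaviour behind steps 3 to 5 can be reproduced with the JDK alone. The sketch below is not the FileSplitterInput code; the class and the safeGet helper are hypothetical, and only the documented difference between HashMap.get(null) and ConcurrentHashMap.get(null) is taken as given.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class NullKeyLookupSketch {
    // Reports whether looking up a null key throws for the given map type.
    static boolean throwsOnNullKey(Map<String, Long> map) {
        try {
            map.get(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Hypothetical guard in the spirit of the fix: check for a null
    // directory path before touching the map at all.
    static Long safeGet(Map<String, Long> referenceTimes, String directoryPath) {
        if (directoryPath == null) {
            return null; // only a file path was configured, nothing to look up
        }
        return referenceTimes.get(directoryPath);
    }

    public static void main(String[] args) {
        System.out.println("HashMap throws: " + throwsOnNullKey(new HashMap<>()));
        System.out.println("ConcurrentHashMap throws: " + throwsOnNullKey(new ConcurrentHashMap<>()));
        System.out.println("guarded lookup: " + safeGet(new ConcurrentHashMap<>(), null));
    }
}
```

The guard keeps the caller's old "null means nothing recorded" contract without depending on the map implementation.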
[GitHub] apex-malhar pull request #463: APEXMALHAR-2312 Fix NullPointerException for ...
Github user deepak-narkhede closed the pull request at: https://github.com/apache/apex-malhar/pull/463
[jira] [Updated] (APEXMALHAR-2302) Exposing few properties of FSSplitter and BlockReader operators to FSRecordReaderModule to tune Application
[ https://issues.apache.org/jira/browse/APEXMALHAR-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede updated APEXMALHAR-2302: Summary: Exposing few properties of FSSplitter and BlockReader operators to FSRecordReaderModule to tune Application (was: Add blockSize property to FSRecordReaderModule) > Exposing few properties of FSSplitter and BlockReader operators to > FSRecordReaderModule to tune Application > --- > > Key: APEXMALHAR-2302 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2302 > Project: Apache Apex Malhar > Issue Type: Improvement > Reporter: Deepak Narkhede > Assignee: Deepak Narkhede >Priority: Minor > > Exposing the blockSize property of FSSplitter operator to > FSRecordReaderModule. This will help end users to tune the blockSize value > based on application needs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #457: APEXMALHAR-2302 Add blockSize property to FSR...
Github user deepak-narkhede closed the pull request at: https://github.com/apache/apex-malhar/pull/457
[GitHub] apex-malhar pull request #457: APEXMALHAR-2302 Add blockSize property to FSR...
GitHub user deepak-narkhede reopened a pull request: https://github.com/apache/apex-malhar/pull/457 APEXMALHAR-2302 Add blockSize property to FSRecordReaderModule This change adds blockSize property from FileSplitter to FSRecordReaderModule. Tested with RecordReader Application. You can merge this pull request into a Git repository by running: $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2302 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/457.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #457 commit 4d52444f40f8e5bd1eb99bb54ca49574439202d4 Author: deepak-narkhede <mailtodeep...@gmail.com> Date: 2016-10-17T16:19:27Z APEXMALHAR-2302 Add blockSize property to FSRecordReaderModule. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] apex-malhar pull request #463: APEXMALHAR-2312 Fix NullPointerException for ...
GitHub user deepak-narkhede opened a pull request:

https://github.com/apache/apex-malhar/pull/463

APEXMALHAR-2312 Fix NullPointerException for FileSplitterInput Operat…

Problem Statement:
A NullPointerException is seen in FileSplitterInput only if a file path, instead of a directory path, is specified for the attribute.

Description:
1) The TimeBasedDirectoryScanner threads, part of the scan service, scan the directories/files.
2) Each thread uses the isIterationCompleted() [referenceTimes] method to check whether the files scanned in the last iteration have been processed by the operator thread.
3) Previously this worked because HashMap (referenceTimes) returns null from get() even when the last scanned directory path is null.
4) Recently referenceTimes was changed to a ConcurrentHashMap, whose get() method does not allow null keys.
5) Hence a NullPointerException is seen: if only a file path is provided, the directory path is null, and so the key is null.

Solution:
Pre-check whether the directory path is null; if it is, only a file path was provided and the last iteration is treated as completed.

Testing logs with fix for files/directories/sub-directories:

2016-10-21 11:20:38,382 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Directory path: /user/deepak/files Sub-Directory or File path: /user/deepak/files/CustomerTxnData2
2016-10-21 11:20:38,382 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Scan started for input /user/deepak/files
2016-10-21 11:20:38,386 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan /user/deepak/files
2016-10-21 11:20:33,372 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData 1477028632605
2016-10-21 11:20:33,372 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData1 1477028642067
2016-10-21 11:20:33,373 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: discovered /user/deepak/files/CustomerTxnData2 1477028645290
2016-10-21 11:20:33,373 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan complete 0 3
2016-10-21 11:25:50,697 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Directory path: null Sub-Directory or File path: /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,697 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: Scan started for input /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,702 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan /user/deepak/files/CustomerTxnData
2016-10-21 11:25:50,704 DEBUG com.datatorrent.lib.io.fs.FileSplitterInput: scan complete

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2312

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/463.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #463

commit 47f29f39393a4e43c8423153d32d12c9622872b5
Author: deepak-narkhede <mailtodeep...@gmail.com>
Date: 2016-10-21T06:44:34Z

APEXMALHAR-2312 Fix NullPointerException for FileSplitterInput Operator if filepath is specified.

Problem Description:
1) The TimeBasedDirectoryScanner threads, part of the scan service, scan the directories/files.
2) Each thread uses the isIterationCompleted() [referenceTimes] method to check whether the files scanned in the last iteration have been processed by the operator thread.
3) Previously this worked because HashMap (referenceTimes) returns null from get() even when the last scanned directory path is null.
4) Recently referenceTimes was changed to a ConcurrentHashMap, whose get() method does not allow null keys.
5) Hence a NullPointerException is seen: if only a file path is provided, the directory path is null, and so the key is null.

Solution:
Pre-check whether the directory path is null; if it is, only a file path was provided and the last iteration is treated as completed.
[jira] [Created] (APEXMALHAR-2308) BlockReader must consider fixed width length in emitted block alignment.
Deepak Narkhede created APEXMALHAR-2308: --- Summary: BlockReader must consider fixed width length in emitted block alignment. Key: APEXMALHAR-2308 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2308 Project: Apache Apex Malhar Issue Type: Bug Reporter: Deepak Narkhede Assignee: Deepak Narkhede The current BlockReader does not consider fixed-length mode while emitting blocks, so a single tuple may be split across two blocks, leading to data inconsistency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
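The splitting problem in this issue is offset arithmetic: when the block size is not a multiple of the record length, a block boundary falls inside a record. The sketch below shows one possible alignment convention, assumed here for illustration and not taken from the BlockReader code: each reader owns exactly the records whose first byte lies inside its block, so boundaries are rounded up to the next record start. All names are hypothetical.

```java
final class FixedWidthAlignmentSketch {
    // Rounds an offset up to the next multiple of recordLength.
    static long roundUp(long offset, long recordLength) {
        return ((offset + recordLength - 1) / recordLength) * recordLength;
    }

    // Returns {start, endExclusive} of the bytes a reader should consume for
    // the raw block [blockStart, blockEnd), so that no record is split.
    static long[] alignedRange(long blockStart, long blockEnd, long recordLength, long fileLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        long start = Math.min(roundUp(blockStart, recordLength), fileLength);
        long end = Math.min(roundUp(blockEnd, recordLength), fileLength);
        return new long[] {start, end};
    }

    public static void main(String[] args) {
        long recordLength = 10;
        long blockSize = 25;
        long fileLength = 60;
        for (long s = 0; s < fileLength; s += blockSize) {
            long e = Math.min(s + blockSize, fileLength);
            long[] r = alignedRange(s, e, recordLength, fileLength);
            System.out.println("raw [" + s + "," + e + ") -> aligned [" + r[0] + "," + r[1] + ")");
        }
    }
}
```

Under this convention the aligned ranges are contiguous and non-overlapping, so every record is emitted whole and exactly once.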
[GitHub] apex-malhar pull request #457: APEXMALHAR 2302 Add blockSize property to FSR...
GitHub user deepak-narkhede reopened a pull request: https://github.com/apache/apex-malhar/pull/457 APEXMALHAR 2302 Add blockSize property to FSRecordReaderModule This change adds blockSize property from FileSplitter to FSRecordReaderModule. Tested with RecordReader Application. You can merge this pull request into a Git repository by running: $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2302 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/457.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #457 commit 4d52444f40f8e5bd1eb99bb54ca49574439202d4 Author: deepak-narkhede <mailtodeep...@gmail.com> Date: 2016-10-17T16:19:27Z APEXMALHAR-2302 Add blockSize property to FSRecordReaderModule. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] apex-malhar pull request #457: APEXMALHAR 2302 Add blockSize property to FSR...
Github user deepak-narkhede closed the pull request at: https://github.com/apache/apex-malhar/pull/457
[GitHub] apex-malhar pull request #457: APEXMALHAR 2302 Add blockSize property to FSR...
GitHub user deepak-narkhede reopened a pull request: https://github.com/apache/apex-malhar/pull/457 APEXMALHAR 2302 Add blockSize property to FSRecordReaderModule This change adds blockSize property from FileSplitter to FSRecordReaderModule. Tested with RecordReader Application. You can merge this pull request into a Git repository by running: $ git pull https://github.com/deepak-narkhede/apex-malhar APEXMALHAR-2302 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/457.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #457 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (APEXMALHAR-2302) Add blockSize property to FSRecordReaderModule
Deepak Narkhede created APEXMALHAR-2302: --- Summary: Add blockSize property to FSRecordReaderModule Key: APEXMALHAR-2302 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2302 Project: Apache Apex Malhar Issue Type: Improvement Reporter: Deepak Narkhede Assignee: Deepak Narkhede Priority: Minor Exposing the blockSize property of FSSplitter operator to FSRecordReaderModule. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Generic Malhar operator to get formatted data.
Hi Priyanka,

Yes, it is a template engine. The initial use case to target is generating automated emails from templates for alerts and logging, and possibly for monitoring as well. It can also convert POJOs, Maps or Lists to XML or JSON format, depending on the template, for consumption by other operators.

Thanks,
Deepak

On Tue, Oct 4, 2016 at 1:46 PM, Priyanka Gugale <priya...@datatorrent.com> wrote:

> As far as I know "freemarker" is a template engine. How does it help us to
> achieve our usecase? Like how does it help in parsing map or POJO. Also how
> does it help in generating output of forms other than HTML?
>
> -Priyanka
>
> On Tue, Oct 4, 2016 at 12:54 PM, Deepak Narkhede <mailtodeep...@gmail.com>
> wrote:
>
> > Hi Folks,
> >
> > Planning to write a Malhar operator which will take a Map or POJO as input
> > and provide a formatted data string as output, as per the specified
> > template data.
> >
> > Use Cases:
> > Get data in an output data format like XML, HTML etc.
> > Generate automated emails etc.
> > Generate configuration files, and source code in some cases.
> > Generate Date/time format etc.
> >
> > How:
> > Planning to use the template engine library "Freemarker", currently under
> > the Apache License.
> > Investigated libraries like freemarker, thymeleaf and velocity.
> >
> > Strengths of Freemarker with respect to the others mentioned above:
> > Ease of use, almost zero dependencies, light weight (faster processing)
> >
> > Please let me know your thoughts/suggestions on this.
> >
> > Thanks,
> > Deepak
> >
> > --
> > Thanks & Regards
> >
> > Deepak Narkhede
>

--
Thanks & Regards
Deepak Narkhede
Generic Malhar operator to get formatted data.
Hi Folks,

Planning to write a Malhar operator which will take a Map or POJO as input and provide a formatted data string as output, as per the specified template data.

Use Cases:
Get data in an output data format like XML, HTML etc.
Generate automated emails etc.
Generate configuration files, and source code in some cases.
Generate Date/time format etc.

How:
Planning to use the template engine library "Freemarker", currently under the Apache License.
Investigated libraries like freemarker, thymeleaf and velocity.

Strengths of Freemarker with respect to the others mentioned above:
Ease of use, almost zero dependencies, light weight (faster processing)

Please let me know your thoughts/suggestions on this.

Thanks,
Deepak

--
Thanks & Regards
Deepak Narkhede
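The "Map in, formatted string out" contract proposed above can be sketched without any template library. The code below is not FreeMarker and does not use its API; it is a deliberately tiny JDK-only stand-in that substitutes ${name} placeholders, and the class and method names are hypothetical. A real operator would hand the template and the data model to the chosen engine instead.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateFormatSketch {
    // Matches simple ${key} placeholders; this is an assumption of the sketch,
    // not the full syntax of any real template engine.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([A-Za-z0-9_]+)\\}");

    // Replaces every ${key} with the matching map value, or with an empty
    // string when the key is absent.
    static String format(String template, Map<String, ?> data) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            Object value = data.get(matcher.group(1));
            String text = value == null ? "" : String.valueOf(value);
            matcher.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> tuple = java.util.Collections.singletonMap("name", "Apex");
        System.out.println(format("<name>${name}</name>", tuple));
    }
}
```

The same input tuple can be rendered as XML, JSON or an email body just by swapping the template string, which is the flexibility the proposal is after.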
[jira] [Updated] (APEXCORE-542) Fix debug level verbose option for apex cli
[ https://issues.apache.org/jira/browse/APEXCORE-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede updated APEXCORE-542: - Description: Fix debug level verbose option for apex cli. Currently "" option displays INFO level but it must display debug level messages. (was: Add DEBUG level verbose option to apex cli. Currently DEBUG level is under default logLevel. ) > Fix debug level verbose option for apex cli > --- > > Key: APEXCORE-542 > URL: https://issues.apache.org/jira/browse/APEXCORE-542 > Project: Apache Apex Core > Issue Type: Bug > Reporter: Deepak Narkhede >Assignee: Deepak Narkhede >Priority: Minor > > Fix debug level verbose option for apex cli. Currently "" option displays > INFO level but it must display debug level messages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXCORE-542) Fix debug level verbose option for apex cli
[ https://issues.apache.org/jira/browse/APEXCORE-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede updated APEXCORE-542: - Summary: Fix debug level verbose option for apex cli (was: Add DEBUG level verbose option to apex cli) > Fix debug level verbose option for apex cli > --- > > Key: APEXCORE-542 > URL: https://issues.apache.org/jira/browse/APEXCORE-542 > Project: Apache Apex Core > Issue Type: Bug > Reporter: Deepak Narkhede > Assignee: Deepak Narkhede >Priority: Minor > > Add DEBUG level verbose option to apex cli. Currently DEBUG level is under > default logLevel. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXCORE-542) Add DEBUG level verbose option to apex cli
Deepak Narkhede created APEXCORE-542: Summary: Add DEBUG level verbose option to apex cli Key: APEXCORE-542 URL: https://issues.apache.org/jira/browse/APEXCORE-542 Project: Apache Apex Core Issue Type: Bug Reporter: Deepak Narkhede Assignee: Deepak Narkhede Priority: Minor Add DEBUG level verbose option to apex cli. Currently DEBUG level is under default logLevel. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-core pull request #399: APEXCORE-310 - Apex cli support to kill the app...
GitHub user deepak-narkhede opened a pull request: https://github.com/apache/apex-core/pull/399 APEXCORE-310 - Apex cli support to kill the app by appname This change adds support for killing an application by application name from the apex cli. Unit testing completed; please find the attached tested scenarios. [testing-kill-support.txt](https://github.com/apache/apex-core/files/499819/testing-kill-support.txt) You can merge this pull request into a Git repository by running: $ git pull https://github.com/deepak-narkhede/apex-core master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-core/pull/399.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #399 commit f3c25d2a0162fc36b04b804fccbc9bec031bd78b Author: deepak-narkhede <mailtodeep...@gmail.com> Date: 2016-09-29T04:14:24Z APEXCORE-310 - Apex cli support to kill the app by appname
[jira] [Comment Edited] (APEXCORE-310) apex cli - support to kill the app by appname
[ https://issues.apache.org/jira/browse/APEXCORE-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15515968#comment-15515968 ] Deepak Narkhede edited comment on APEXCORE-310 at 9/23/16 9:44 AM: --- Current Status: Fix is completed carrying on more unit tests for the fix. was (Author: deepak-narkhede): Current Status: Fix has completed carrying on more unit tests for the fix. > apex cli - support to kill the app by appname > - > > Key: APEXCORE-310 > URL: https://issues.apache.org/jira/browse/APEXCORE-310 > Project: Apache Apex Core > Issue Type: Improvement >Reporter: Sandesh > Assignee: Deepak Narkhede >Priority: Minor > > dtcli should support the ability to kill the app by appname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-310) apex cli - support to kill the app by appname
[ https://issues.apache.org/jira/browse/APEXCORE-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15515968#comment-15515968 ] Deepak Narkhede commented on APEXCORE-310: -- Current Status: Fix has completed carrying on more unit tests for the fix. > apex cli - support to kill the app by appname > - > > Key: APEXCORE-310 > URL: https://issues.apache.org/jira/browse/APEXCORE-310 > Project: Apache Apex Core > Issue Type: Improvement >Reporter: Sandesh > Assignee: Deepak Narkhede >Priority: Minor > > dtcli should support the ability to kill the app by appname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15512979#comment-15512979 ] Deepak Narkhede commented on APEXCORE-528: -- Hi Alex, Are you currently working on it? If not, please let me know; I would like to take it up. Thanks, Deepak > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (APEXCORE-310) apex cli - support to kill the app by appname
[ https://issues.apache.org/jira/browse/APEXCORE-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede reassigned APEXCORE-310: Assignee: Deepak Narkhede > apex cli - support to kill the app by appname > - > > Key: APEXCORE-310 > URL: https://issues.apache.org/jira/browse/APEXCORE-310 > Project: Apache Apex Core > Issue Type: Improvement >Reporter: Sandesh > Assignee: Deepak Narkhede >Priority: Minor > > dtcli should support the ability to kill the app by appname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)