Re: Writing files to MapR File system using putHDFS

2016-06-16 Thread Sumanth Chinthagunta
I tried the 0.7.0 branch and am still getting the same error :(

git checkout -b 0.x origin/0.x

I notice nifi-env.sh is added to bin now!

I have java in my path, so why is the script asking me to set JAVA_HOME?

PS: I am building on Windows, moving the zip to Linux, and running NiFi on 
Linux. (I fixed the line endings in the scripts by converting from DOS to Unix.) 
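
For anyone hitting the same problem, here is a minimal sketch of the DOS to Unix line-ending conversion mentioned above. It is written in Python purely as an assumption (a tool such as dos2unix would do the same job), and the function names and the bin directory argument are illustrative, not part of NiFi.

```python
from pathlib import Path
from typing import List


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace DOS (CRLF) line endings with Unix (LF) ones."""
    return data.replace(b"\r\n", b"\n")


def convert_scripts(bin_dir: str) -> List[str]:
    """Rewrite every *.sh file in bin_dir in place; return the names that changed."""
    changed = []
    for script in sorted(Path(bin_dir).glob("*.sh")):
        original = script.read_bytes()
        converted = to_unix_line_endings(original)
        if converted != original:
            script.write_bytes(converted)
            changed.append(script.name)
    return changed
```

Calling convert_scripts("bin") from the unpacked NiFi directory would rewrite only the scripts that still contain CRLF endings and report which ones it touched.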

-sumo 


> On Jun 16, 2016, at 5:07 PM, Sumanth Chinthagunta  wrote:
> 
> 
> I did a new build with master branch with
> 
> mvn -T 2.0C clean install -DskipTests -Pmapr -Dhadoop.version=2.5.1-mapr-1503
> 
> But when I try to start NiFi , got error:  
> $ ./bin/nifi.sh start
> nifi.sh: JAVA_HOME not set; results may vary
>  
> then I set JAVA_HOME and still getting starting error. Am I missing something 
> ? 
> $ ./bin/nifi.sh start
> does not exist. Exiting.
> 
> Thanks 
> Sumo
> 
>> On Jun 15, 2016, at 11:23 PM, Andre wrote:
>> 
>> Sumanth,
>> 
>> I currently have only being doing tests with master but as far as I know it 
>> should work with 0.x branch as well
>> 
>> Cheers
>> 
>> On Thu, Jun 16, 2016 at 3:33 PM, Sumanth Chinthagunta wrote:
>> 
>> Thanks for instructions Andre.
>> Should we build with master branch from github or 0.6.1 ?
>> Hope this helps me to use kites and hbase processors with MapR
>> -Sumo
>> Sent from my iPhone
>> 
> 



Re: Writing files to MapR File system using putHDFS

2016-06-16 Thread Sumanth Chinthagunta

I did a new build from the master branch with:

mvn -T 2.0C clean install -DskipTests -Pmapr -Dhadoop.version=2.5.1-mapr-1503

But when I try to start NiFi, I get this error:
$ ./bin/nifi.sh start
nifi.sh: JAVA_HOME not set; results may vary
 
Then I set JAVA_HOME and am still getting an error on startup. Am I missing something?
$ ./bin/nifi.sh start
does not exist. Exiting.

Thanks 
Sumo

> On Jun 15, 2016, at 11:23 PM, Andre  wrote:
> 
> Sumanth,
> 
> I currently have only being doing tests with master but as far as I know it 
> should work with 0.x branch as well
> 
> Cheers
> 
> On Thu, Jun 16, 2016 at 3:33 PM, Sumanth Chinthagunta wrote:
> 
> Thanks for instructions Andre.
> Should we build with master branch from github or 0.6.1 ?
> Hope this helps me to use kites and hbase processors with MapR
> -Sumo
> Sent from my iPhone
> 



Re: Scheduling using CRON driven on Windows OS

2016-06-16 Thread Keith Lim
Thanks Mark.  I appreciate the effort and the quality work that you guys are 
doing.


Thanks,
keith


From: Mark Payne 
Sent: Thursday, June 16, 2016 3:13 PM
To: users@nifi.apache.org
Subject: Re: Scheduling using CRON driven on Windows OS

Keith,

I believe there already is a PR for this. I did an initial review and things 
looked good but it touches some very critical parts of the application and 
needs to be scrutinized and reviewed much more thoroughly before being merged 
in.

Thanks
-Mark

Sent from my iPhone

On Jun 16, 2016, at 5:39 PM, Keith Lim wrote:


Just found out from my coworker the NiFi is running on a cluster of 20 nodes, 
hence the 20 flowfiles generated.


Also found out that when the NiFi is hosted in the clustered environment, there 
is an additional option for Scheduling Strategy: On Primary Node

This options ensures that the processor only runs on the primary node (single 
node) using the run schedule time defined.

So, as it is I can do Cron but have to be subjected to the number of nodes in 
the cluster or using the On Primary Node option to run on single node but do 
not have CRON option.
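
As a back-of-the-envelope sketch of the behaviour described here, assuming that every node running the processor emits one batch each time the schedule triggers, the count per trigger works out as below. The helper flowfiles_per_trigger is hypothetical and the Batch Size of 1 is an assumption.

```python
def flowfiles_per_trigger(node_count: int, primary_node_only: bool, batch_size: int = 1) -> int:
    """Estimate how many flowfiles one scheduled trigger produces across a cluster."""
    # With primary-node-only scheduling a single node runs the processor;
    # otherwise every node in the cluster runs its own copy.
    running_nodes = 1 if primary_node_only else node_count
    return running_nodes * batch_size


print(flowfiles_per_trigger(20, primary_node_only=False))  # 20
print(flowfiles_per_trigger(20, primary_node_only=True))   # 1
```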


I wish the On Primary Node option can be applied to both Time Driven and Cron 
Driven, i.e. a separate checkbox.


Is there any workaround for this?   If not could we prioritize this as an 
enhancement?


Thanks,
Keith


From: Bryan Bende
Sent: Thursday, June 16, 2016 1:45 PM
To: users@nifi.apache.org
Subject: Re: Scheduling using CRON driven on Windows OS

Does 0 0/2 * * * ?  work?

Another option, if you don't care about it being on the exact time boundaries, 
you could use the timer driven scheduling and set it to 120 seconds, that would 
be every 2 minutes from when the processor starts.
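
A rough sketch of what the increment syntax in the minutes field means, assuming Quartz-style semantics in which start/step matches start, start+step, and so on up to the field's maximum, with a "*" start treated as 0. expand_increment is a hypothetical helper, not a NiFi or Quartz API.

```python
from typing import List


def expand_increment(field: str, upper: int = 59) -> List[int]:
    """Expand a 'start/step' cron field into the values it matches, from start to upper."""
    start_text, step_text = field.split("/")
    # A '*' before the slash is assumed to mean "start at 0".
    start = 0 if start_text == "*" else int(start_text)
    step = int(step_text)
    return list(range(start, upper + 1, step))


even_minutes = expand_increment("0/2")
print(even_minutes[:5])  # [0, 2, 4, 6, 8]
```

Under these assumptions 0/2, */2 and 00/02 in the minutes field all match the same 30 minutes of each hour. The sketch only covers that one field: an expression with * in the seconds field, such as * */2 * * * ?, would match every second of those minutes.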

On Thu, Jun 16, 2016 at 4:23 PM, Keith Lim wrote:

I have a follow up question.


I want 1 flowfile generated every two minutes: 00 */2 * * * ?   ( I have tried 
* */2 * * * ?,  0 */2 * * * ?  00 00/02 * * * ? and  as well)

but instead I am seeing 20 flowfiles generated every two minutes.

I even set the Yield Duration to 5 secs just to see if it does any thing, but I 
am not able to get it to generate only 1 flowfile every two minutes.


Thanks,
Keith


From: Keith Lim
Sent: Thursday, June 16, 2016 10:44 AM

To: users@nifi.apache.org
Subject: Re: Scheduling using CRON driven on Windows OS


Thanks for pointing this out.  My bad for not referring to the NiFi 
documentation.   When I saw the Cron Driven and presented with the default "* * 
* * * ?", my instinct was to reference a standard Linux/Unix Cron format.


Thanks,
Keith


From: Bryan Bende
Sent: Thursday, June 16, 2016 10:19 AM
To: users@nifi.apache.org
Subject: Re: Scheduling using CRON driven on Windows OS

Also, the user guide has a description of the scheduling strategies which 
described the cron format:

https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab

On Thu, Jun 16, 2016 at 1:17 PM, Pierre Villard wrote:
Hi Keith,

This is the expected behavior, the first parameter is indeed seconds so that 
*/5 * * * * ? will generate a FF every 5 seconds.
In your case, I believe you'd like something like 00 02 10 * * ?

Hope this helps.

2016-06-16 19:13 GMT+02:00 Keith Lim:

My GenerateFlowFile processor is enabled and started

with


Scheduling Strategy: Cron Driven

Run Schedule : 02 10 * * * ?


This I expects to generate a flow file daily at 10:02 am, but from my limited 
test, it seems to take the second parameter as minutes, and generate a flowfile 
hourly at 10 minutes after the hour, e.g. 10:10, 11:10, 12:10, 13:10...

Perhaps there is a bug in parsing system date format?


Attached is a simple template that I am using for testing.


Thanks,
Keith



From: Andrew Grande
Sent: Wednesday, June 15, 2016 4:10 PM
To: users@nifi.apache.org
Subject: Re: Scheduling using CRON driven on Windows OS


Keith,

Was your processor running at all times? It has to be started and enabled.

I guess sharing the cron expression and maybe a quick screenshot will help next.

Andrew

On Wed, Jun 15, 2016, 6:58 PM Keith Lim wrote:
I tried setting Scheduling Strategy property to CRON Driven but does not seem 
to work.   Sometimes it would fire when not expected to and others not fire 
when expected to.

Re: Scheduling using CRON driven on Windows OS

2016-06-16 Thread Mark Payne
Keith,

I believe there already is a PR for this. I did an initial review and things 
looked good but it touches some very critical parts of the application and 
needs to be scrutinized and reviewed much more thoroughly before being merged 
in. 

Thanks
-Mark

Sent from my iPhone

> On Jun 16, 2016, at 5:39 PM, Keith Lim  wrote:
> 
> Just found out from my coworker the NiFi is running on a cluster of 20 nodes, 
> hence the 20 flowfiles generated.  
> 
> 
> 
> Also found out that when the NiFi is hosted in the clustered environment, 
> there is an additional option for Scheduling Strategy: On Primary Node
> This options ensures that the processor only runs on the primary node (single 
> node) using the run schedule time defined. 
> 
> So, as it is I can do Cron but have to be subjected to the number of nodes in 
> the cluster or using the On Primary Node option to run on single node but do 
> not have CRON option. 
> 
> 
> I wish the On Primary Node option can be applied to both Time Driven and Cron 
> Driven, i.e. a separate checkbox.
> 
> 
> Is there any workaround for this?   If not could we prioritize this as an 
> enhancement?
> 
> 
> Thanks,
> Keith
> 
> From: Bryan Bende 
> Sent: Thursday, June 16, 2016 1:45 PM
> To: users@nifi.apache.org
> Subject: Re: Scheduling using CRON driven on Windows OS
>  
> Does 0 0/2 * * * ?  work?
> 
> Another option, if you don't care about it being on the exact time 
> boundaries, you could use the timer driven scheduling and set it to 120 
> seconds, that would be every 2 minutes from when the processor starts.
> 
>> On Thu, Jun 16, 2016 at 4:23 PM, Keith Lim  wrote:
>> I have a follow up question.
>> 
>> 
>> I want 1 flowfile generated every two minutes: 00 */2 * * * ?   ( I have 
>> tried * */2 * * * ?,  0 */2 * * * ?  00 00/02 * * * ? and  as well)
>> 
>> but instead I am seeing 20 flowfiles generated every two minutes.
>> 
>> I even set the Yield Duration to 5 secs just to see if it does any thing, 
>> but I am not able to get it to generate only 1 flowfile every two minutes. 
>> 
>> 
>> Thanks,
>> Keith
>> 
>> 
>> From: Keith Lim 
>> Sent: Thursday, June 16, 2016 10:44 AM
>> 
>> To: users@nifi.apache.org
>> Subject: Re: Scheduling using CRON driven on Windows OS
>>  
>> Thanks for pointing this out.  My bad for not referring to the NiFi 
>> documentation.   When I saw the Cron Driven and presented with the default 
>> "* * * * * ?", my instinct was to reference a standard Linux/Unix Cron 
>> format.
>> 
>> 
>> 
>> Thanks,
>> Keith
>> 
>> From: Bryan Bende 
>> Sent: Thursday, June 16, 2016 10:19 AM
>> To: users@nifi.apache.org
>> Subject: Re: Scheduling using CRON driven on Windows OS
>>  
>> Also, the user guide has a description of the scheduling strategies which 
>> described the cron format:
>> 
>> https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab
>> 
>>> On Thu, Jun 16, 2016 at 1:17 PM, Pierre Villard 
>>>  wrote:
>>> Hi Keith,
>>> 
>>> This is the expected behavior, the first parameter is indeed seconds so 
>>> that */5 * * * * ? will generate a FF every 5 seconds.
>>> In your case, I believe you'd like something like 00 02 10 * * ?
>>> 
>>> Hope this helps.
>>> 
>>> 2016-06-16 19:13 GMT+02:00 Keith Lim :
>>>> My GenerateFlowFile processor is enabled and started
>>>> 
>>>> with
>>>> 
>>>> Scheduling Strategy: Cron Driven
>>>> 
>>>> Run Schedule : 02 10 * * * ?
>>>> 
>>>> This I expects to generate a flow file daily at 10:02 am, but from my 
>>>> limited test, it seems to take the second parameter as minutes, and 
>>>> generate a flowfile hourly at 10 minutes after the hour, e.g. 10:10, 
>>>> 11:10, 12:10, 13:10...
>>>> 
>>>> Perhaps there is a bug in parsing system date format? 
>>>> 
>>>> Attached is a simple template that I am using for testing.
>>>> 
>>>> Thanks,
>>>> Keith
>>>> 
>>>> From: Andrew Grande 
>>>> Sent: Wednesday, June 15, 2016 4:10 PM
>>>> To: users@nifi.apache.org
>>>> Subject: Re: Scheduling using CRON driven on Windows OS
>>>> 
>>>> Keith,
>>>> 
>>>> Was your processor running at all times? It has to be started and enabled.
>>>> 
>>>> I guess sharing the cron expression and maybe a quick screenshot will help 
>>>> next.
>>>> 
>>>> Andrew
>>>> 
> On Wed, Jun 15, 2016, 6:58 PM Keith Lim  wrote:
> I tried setting Scheduling Strategy property to CRON Driven but does not 
> seem to work.   Sometimes it would fire when not expected to and others 
> not fire when expected to. 
> This is on Windows OS and the processor I tried was GenerateFlowFile.  Is 
> CRON Driven setting not designed to work on Windows OS?
> 
> Thanks,
> Keith
> 


Re: Scheduling using CRON driven on Windows OS

2016-06-16 Thread Keith Lim
Thanks for pointing this out.  My bad for not referring to the NiFi 
documentation.   When I saw Cron Driven and was presented with the default 
"* * * * * ?", my instinct was to refer to the standard Linux/Unix cron format.


Thanks,
Keith


From: Bryan Bende 
Sent: Thursday, June 16, 2016 10:19 AM
To: users@nifi.apache.org
Subject: Re: Scheduling using CRON driven on Windows OS

Also, the user guide has a description of the scheduling strategies which 
described the cron format:

https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab

On Thu, Jun 16, 2016 at 1:17 PM, Pierre Villard wrote:
Hi Keith,

This is the expected behavior, the first parameter is indeed seconds so that 
*/5 * * * * ? will generate a FF every 5 seconds.
In your case, I believe you'd like something like 00 02 10 * * ?

Hope this helps.

2016-06-16 19:13 GMT+02:00 Keith Lim:

My GenerateFlowFile processor is enabled and started

with


Scheduling Strategy: Cron Driven

Run Schedule : 02 10 * * * ?


This I expects to generate a flow file daily at 10:02 am, but from my limited 
test, it seems to take the second parameter as minutes, and generate a flowfile 
hourly at 10 minutes after the hour, e.g. 10:10, 11:10, 12:10, 13:10...

Perhaps there is a bug in parsing system date format?


Attached is a simple template that I am using for testing.


Thanks,
Keith



From: Andrew Grande
Sent: Wednesday, June 15, 2016 4:10 PM
To: users@nifi.apache.org
Subject: Re: Scheduling using CRON driven on Windows OS


Keith,

Was your processor running at all times? It has to be started and enabled.

I guess sharing the cron expression and maybe a quick screenshot will help next.

Andrew

On Wed, Jun 15, 2016, 6:58 PM Keith Lim wrote:
I tried setting Scheduling Strategy property to CRON Driven but does not seem 
to work.   Sometimes it would fire when not expected to and others not fire 
when expected to.
This is on Windows OS and the processor I tried was GenerateFlowFile.  Is CRON 
Driven setting not designed to work on Windows OS?

Thanks,
Keith






Re: Scheduling using CRON driven on Windows OS

2016-06-16 Thread Bryan Bende
Also, the user guide has a description of the scheduling strategies which
describes the cron format:

https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab

On Thu, Jun 16, 2016 at 1:17 PM, Pierre Villard  wrote:

> Hi Keith,
>
> This is the expected behavior, the first parameter is indeed seconds so
> that */5 * * * * ? will generate a FF every 5 seconds.
> In your case, I believe you'd like something like 00 02 10 * * ?
>
> Hope this helps.
>
> 2016-06-16 19:13 GMT+02:00 Keith Lim :
>
>> My GenerateFlowFile processor is enabled and started
>>
>> with
>>
>>
>> Scheduling Strategy: Cron Driven
>>
>> Run Schedule : 02 10 * * * ?
>>
>>
>> This I expects to generate a flow file daily at 10:02 am, but from my
>> limited test, it seems to take the second parameter as minutes, and
>> generate a flowfile hourly at 10 minutes after the hour, e.g. 10:10, 11:10,
>> 12:10, 13:10...
>>
>> Perhaps there is a bug in parsing system date format?
>>
>>
>> Attached is a simple template that I am using for testing.
>>
>>
>> Thanks,
>> Keith
>>
>>
>> --
>> *From:* Andrew Grande 
>> *Sent:* Wednesday, June 15, 2016 4:10 PM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: Scheduling using CRON driven on Windows OS
>>
>>
>> Keith,
>>
>> Was your processor running at all times? It has to be started and enabled.
>>
>> I guess sharing the cron expression and maybe a quick screenshot will
>> help next.
>>
>> Andrew
>>
>> On Wed, Jun 15, 2016, 6:58 PM Keith Lim  wrote:
>>
>>> I tried setting Scheduling Strategy property to CRON Driven but does not
>>> seem to work.   Sometimes it would fire when not expected to and others not
>>> fire when expected to.
>>> This is on Windows OS and the processor I tried was GenerateFlowFile.
>>> Is CRON Driven setting not designed to work on Windows OS?
>>>
>>> Thanks,
>>> Keith
>>>
>>>
>>>
>


Re: Scheduling using CRON driven on Windows OS

2016-06-16 Thread Pierre Villard
Hi Keith,

This is the expected behavior: the first parameter is indeed seconds, so
*/5 * * * * ? will generate a flowfile every 5 seconds.
In your case, I believe you'd like something like 00 02 10 * * ?

Hope this helps.
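
For reference, a minimal sketch of how the two expressions discussed above split into fields, assuming the six-field order (seconds, minutes, hours, day of month, month, day of week) that the NiFi user guide describes for Cron Driven scheduling. The helper label_cron_fields is made up for illustration and is not part of NiFi; it only labels the fields and does not validate or evaluate them.

```python
from typing import Dict

# Assumed field order for a six-field, Quartz-style cron expression.
FIELD_NAMES = ("seconds", "minutes", "hours", "day_of_month", "month", "day_of_week")


def label_cron_fields(expression: str) -> Dict[str, str]:
    """Pair each whitespace-separated field of the expression with its assumed name."""
    parts = expression.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


as_entered = label_cron_fields("02 10 * * * ?")
as_suggested = label_cron_fields("00 02 10 * * ?")

# As entered: second 02 of minute 10 of every hour.
print(as_entered["seconds"], as_entered["minutes"], as_entered["hours"])        # 02 10 *
# As suggested: second 00 of minute 02 of hour 10, once a day.
print(as_suggested["seconds"], as_suggested["minutes"], as_suggested["hours"])  # 00 02 10
```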

2016-06-16 19:13 GMT+02:00 Keith Lim :

> My GenerateFlowFile processor is enabled and started
>
> with
>
>
> Scheduling Strategy: Cron Driven
>
> Run Schedule : 02 10 * * * ?
>
>
> This I expects to generate a flow file daily at 10:02 am, but from my
> limited test, it seems to take the second parameter as minutes, and
> generate a flowfile hourly at 10 minutes after the hour, e.g. 10:10, 11:10,
> 12:10, 13:10...
>
> Perhaps there is a bug in parsing system date format?
>
>
> Attached is a simple template that I am using for testing.
>
>
> Thanks,
> Keith
>
>
> --
> *From:* Andrew Grande 
> *Sent:* Wednesday, June 15, 2016 4:10 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Scheduling using CRON driven on Windows OS
>
>
> Keith,
>
> Was your processor running at all times? It has to be started and enabled.
>
> I guess sharing the cron expression and maybe a quick screenshot will help
> next.
>
> Andrew
>
> On Wed, Jun 15, 2016, 6:58 PM Keith Lim  wrote:
>
>> I tried setting Scheduling Strategy property to CRON Driven but does not
>> seem to work.   Sometimes it would fire when not expected to and others not
>> fire when expected to.
>> This is on Windows OS and the processor I tried was GenerateFlowFile.  Is
>> CRON Driven setting not designed to work on Windows OS?
>>
>> Thanks,
>> Keith
>>
>>
>>


Re: Scheduling using CRON driven on Windows OS

2016-06-16 Thread Keith Lim
My GenerateFlowFile processor is enabled and started

with


Scheduling Strategy: Cron Driven

Run Schedule : 02 10 * * * ?


I expect this to generate a flowfile daily at 10:02 am, but in my limited 
testing it seems to take the second parameter as minutes and generates a 
flowfile hourly at 10 minutes after the hour, e.g. 10:10, 11:10, 12:10, 13:10...

Perhaps there is a bug in parsing the system date format?


Attached is a simple template that I am using for testing.


Thanks,
Keith



From: Andrew Grande 
Sent: Wednesday, June 15, 2016 4:10 PM
To: users@nifi.apache.org
Subject: Re: Scheduling using CRON driven on Windows OS


Keith,

Was your processor running at all times? It has to be started and enabled.

I guess sharing the cron expression and maybe a quick screenshot will help next.

Andrew

On Wed, Jun 15, 2016, 6:58 PM Keith Lim wrote:
I tried setting Scheduling Strategy property to CRON Driven but does not seem 
to work.   Sometimes it would fire when not expected to and others not fire 
when expected to.
This is on Windows OS and the processor I tried was GenerateFlowFile.  Is CRON 
Driven setting not designed to work on Windows OS?

Thanks,
Keith


[Attachment: NiFi template "TestStartOfWorkflowWithCron", timestamp 06/16/2016 10:03:57 PDT. The XML markup was lost in the archive; the recoverable settings are:]

Processor: TestStartOfWorkflowWithCron (org.apache.nifi.processors.standard.GenerateFlowFile)
  State: RUNNING
  Scheduling Strategy: CRON_DRIVEN
  Run Schedule: 02 10 * * * ?
  File Size: 1 B
  Batch Size: 1
  Data Format: Text
  Unique FlowFiles: false

Connection: TestStartOfWorkflowWithCron -> LogToFile (relationship: success)

Processor: LogToFile (org.apache.nifi.processors.standard.PutFile)
  State: DISABLED
  Scheduling Strategy: TIMER_DRIVEN
  Directory: C:\Temp\LogToFile
  Conflict Resolution Strategy: fail
  Create Missing Directories: true