[jira] [Assigned] (NIFI-5492) UDF in Expression Language

2021-11-16 Thread Ed Berezitsky (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky reassigned NIFI-5492:
---

Assignee: (was: Ed Berezitsky)

> UDF in Expression Language
> --
>
> Key: NIFI-5492
> URL: https://issues.apache.org/jira/browse/NIFI-5492
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0, 1.7.0, 1.7.1
>Reporter: Ed Berezitsky
>Priority: Major
>  Labels: features, patch
> Attachments: 0001-NIFI-5492_EXEC-Adding-UDF-to-EL.patch
>
>
> Set of functions available to use in expression language is limited by 
> predefined ones.
> This request is to provide an ability to plug in custom/user defined 
> functions.
> For example:
> ${*exec*('com.example.MyUDF', 'param1', 'param2')}
> Should be able to support:
>  # Multiple, not limited number of parameters (including zero params)
>  # Param data types should  support all EL data types (dates, whole numbers, 
> decimals, strings, booleans)
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (NIFI-5965) Add TTL for JMS-related Functionality

2021-11-16 Thread Ed Berezitsky (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky reassigned NIFI-5965:
---

Assignee: (was: Ed Berezitsky)

> Add TTL for JMS-related Functionality
> -
>
> Key: NIFI-5965
> URL: https://issues.apache.org/jira/browse/NIFI-5965
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Priority: Major
>
> As a user, I would like to define TTL (time to live) JNDI Controller Service 
> and PublishJMS/ConsumeJMS processors.
> Use case:
>  * JNDI mapping can be changed, but JNDI controller service doesn't refresh 
> configuration (connection factory).
> Functionality to be implemented:
>  # Add TTL for connection factories obtained by JNDI controller service.
>  # Add TTL for connections cached by PublishJMS/ConsumeJMS processors.
>  # Default TTL in both cases should be "0" to indicate "no-refresh required" 
> (backward compability)
>  # TTL parameter should not be required (backward compatibility)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (NIFI-8216) Glue Schema Registry Integration

2021-02-09 Thread Ed Berezitsky (Jira)
Ed Berezitsky created NIFI-8216:
---

 Summary: Glue Schema Registry Integration
 Key: NIFI-8216
 URL: https://issues.apache.org/jira/browse/NIFI-8216
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Extensions
Reporter: Ed Berezitsky


[Glue Schema Registry 
(GSR)|https://docs.aws.amazon.com/glue/latest/dg/schema-registry.html] is 
available in addition to Confluent and HWX ones.

Suggested new Feature: new Schema Registry Controller for GSR.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (NIFI-6181) FetchSFTP and FetchFTP should not emit error if file not found

2019-04-04 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-6181:

Fix Version/s: 1.9.1
   Attachment: 3407.patch.txt
   Status: Patch Available  (was: Open)

> FetchSFTP and FetchFTP should not emit error if file not found
> --
>
> Key: NIFI-6181
> URL: https://issues.apache.org/jira/browse/NIFI-6181
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.9.1
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, usability
> Fix For: 1.9.1
>
> Attachments: 3407.patch.txt
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently FetchSFTP processor sends flow file to relationship "not.found", 
> but still prints error into log and into bulletin.
> Since "not found" is dedicated relationship, there is no need for error to be 
> printed. It affects NIFI bulletin-based (or log-based) monitoring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-6181) FetchSFTP and FetchFTP should not emit error if file not found

2019-04-03 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809508#comment-16809508
 ] 

Ed Berezitsky commented on NIFI-6181:
-

Added FetchFTP to the description.

FetchFTP also has a bug on handling error caused by File Not Found - it returns 
IOException instead. This can be fixed in FTPTransfer class.

> FetchSFTP and FetchFTP should not emit error if file not found
> --
>
> Key: NIFI-6181
> URL: https://issues.apache.org/jira/browse/NIFI-6181
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.9.1
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, usability
>
> Currently FetchSFTP processor sends flow file to relationship "not.found", 
> but still prints error into log and into bulletin.
> Since "not found" is dedicated relationship, there is no need for error to be 
> printed. It affects NIFI bulletin-based (or log-based) monitoring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-6181) FetchSFTP and FetchFTP should not emit error if file not found

2019-04-03 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-6181:

Summary: FetchSFTP and FetchFTP should not emit error if file not found  
(was: FetchSFTP should not emit error if file not found)

> FetchSFTP and FetchFTP should not emit error if file not found
> --
>
> Key: NIFI-6181
> URL: https://issues.apache.org/jira/browse/NIFI-6181
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.9.1
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, usability
>
> Currently FetchSFTP processor sends flow file to relationship "not.found", 
> but still prints error into log and into bulletin.
> Since "not found" is dedicated relationship, there is no need for error to be 
> printed. It affects NIFI bulletin-based (or log-based) monitoring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-6181) FetchSFTP should not emit error if file not found

2019-04-03 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky reassigned NIFI-6181:
---

Assignee: Ed Berezitsky

> FetchSFTP should not emit error if file not found
> -
>
> Key: NIFI-6181
> URL: https://issues.apache.org/jira/browse/NIFI-6181
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.9.1
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, usability
>
> Currently FetchSFTP processor sends flow file to relationship "not.found", 
> but still prints error into log and into bulletin.
> Since "not found" is dedicated relationship, there is no need for error to be 
> printed. It affects NIFI bulletin-based (or log-based) monitoring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-6181) FetchSFTP should not emit error if file not found

2019-04-03 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-6181:
---

 Summary: FetchSFTP should not emit error if file not found
 Key: NIFI-6181
 URL: https://issues.apache.org/jira/browse/NIFI-6181
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Affects Versions: 1.9.1
Reporter: Ed Berezitsky


Currently FetchSFTP processor sends flow file to relationship "not.found", but 
still prints error into log and into bulletin.

Since "not found" is dedicated relationship, there is no need for error to be 
printed. It affects NIFI bulletin-based (or log-based) monitoring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-5965) Add TTL for JMS-related Functionality

2019-01-19 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5965:
---

 Summary: Add TTL for JMS-related Functionality
 Key: NIFI-5965
 URL: https://issues.apache.org/jira/browse/NIFI-5965
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


As a user, I would like to define TTL (time to live) JNDI Controller Service 
and PublishJMS/ConsumeJMS processors.

Use case:
 * JNDI mapping can be changed, but JNDI controller service doesn't refresh 
configuration (connection factory).

Functionality to be implemented:
 # Add TTL for connection factories obtained by JNDI controller service.
 # Add TTL for connections cached by PublishJMS/ConsumeJMS processors.
 # Default TTL in both cases should be "0" to indicate "no-refresh required" 
(backward compability)
 # TTL parameter should not be required (backward compatibility)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5869) JMS Connection Fails After JMS servers Change behind JNDI

2019-01-19 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747212#comment-16747212
 ] 

Ed Berezitsky commented on NIFI-5869:
-

update patch

> JMS Connection Fails After JMS servers Change behind JNDI
> -
>
> Key: NIFI-5869
> URL: https://issues.apache.org/jira/browse/NIFI-5869
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Attachments: 3261.patch.txt, JNDI_JMS_Exception.txt
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> JMS Connection Fails After JMS servers Change behind JNDI.
> Reproduce:
>  # Define and enable JNDI Controller Service
>  # Create a flow with ConsumeJMS or PublishJMS processors with controller 
> service defined in #1.
>  # Consume and publish at least one message to ensure the connectivity can be 
> established.
>  # Change JNDI configuration for the same connection factory to point to new 
> JMS servers.
>  # Stop JMS service on previous servers
>  # Observe failure in ConsumeJMS/PublishJMS (Caused by: 
> javax.jms.JMSException: Failed to connect to any server at: 
> tcp://jms_server1:12345)
>  
> Work Around:
>  # Disable JNDI Controller Service
>  # Enable JNDI Controller Service and dependent processors.
>  
> Possible Issue/Fix:
>  * AbstractJMSProcessor has a method "buildTargetResource", in which 
> connection factory is instantiated and then cached in workerPool in onTrigger 
> .
>  * Issue: Once cached, it will be reused forever.
>  * Fix: on connectivity failure there should be an attempt to rebuild the 
> worker. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5869) JMS Connection Fails After JMS servers Change behind JNDI

2019-01-19 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5869:

Attachment: (was: 3261.patch.txt)

> JMS Connection Fails After JMS servers Change behind JNDI
> -
>
> Key: NIFI-5869
> URL: https://issues.apache.org/jira/browse/NIFI-5869
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Attachments: 3261.patch.txt, JNDI_JMS_Exception.txt
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> JMS Connection Fails After JMS servers Change behind JNDI.
> Reproduce:
>  # Define and enable JNDI Controller Service
>  # Create a flow with ConsumeJMS or PublishJMS processors with controller 
> service defined in #1.
>  # Consume and publish at least one message to ensure the connectivity can be 
> established.
>  # Change JNDI configuration for the same connection factory to point to new 
> JMS servers.
>  # Stop JMS service on previous servers
>  # Observe failure in ConsumeJMS/PublishJMS (Caused by: 
> javax.jms.JMSException: Failed to connect to any server at: 
> tcp://jms_server1:12345)
>  
> Work Around:
>  # Disable JNDI Controller Service
>  # Enable JNDI Controller Service and dependent processors.
>  
> Possible Issue/Fix:
>  * AbstractJMSProcessor has a method "buildTargetResource", in which 
> connection factory is instantiated and then cached in workerPool in onTrigger 
> .
>  * Issue: Once cached, it will be reused forever.
>  * Fix: on connectivity failure there should be an attempt to rebuild the 
> worker. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5869) JMS Connection Fails After JMS servers Change behind JNDI

2019-01-19 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5869:

Attachment: 3261.patch.txt

> JMS Connection Fails After JMS servers Change behind JNDI
> -
>
> Key: NIFI-5869
> URL: https://issues.apache.org/jira/browse/NIFI-5869
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Attachments: 3261.patch.txt, JNDI_JMS_Exception.txt
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> JMS Connection Fails After JMS servers Change behind JNDI.
> Reproduce:
>  # Define and enable JNDI Controller Service
>  # Create a flow with ConsumeJMS or PublishJMS processors with controller 
> service defined in #1.
>  # Consume and publish at least one message to ensure the connectivity can be 
> established.
>  # Change JNDI configuration for the same connection factory to point to new 
> JMS servers.
>  # Stop JMS service on previous servers
>  # Observe failure in ConsumeJMS/PublishJMS (Caused by: 
> javax.jms.JMSException: Failed to connect to any server at: 
> tcp://jms_server1:12345)
>  
> Work Around:
>  # Disable JNDI Controller Service
>  # Enable JNDI Controller Service and dependent processors.
>  
> Possible Issue/Fix:
>  * AbstractJMSProcessor has a method "buildTargetResource", in which 
> connection factory is instantiated and then cached in workerPool in onTrigger 
> .
>  * Issue: Once cached, it will be reused forever.
>  * Fix: on connectivity failure there should be an attempt to rebuild the 
> worker. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5869) JMS Connection Fails After JMS servers Change behind JNDI

2019-01-18 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746902#comment-16746902
 ] 

Ed Berezitsky commented on NIFI-5869:
-

Fix provided in PR #3261:

When worker fails to connect to JMS, it will set a worker into invalid state.

A processor (ConsumeJMS/PublishJMS) will get Controller Service and call 
"resetConnectionFactory", then will try to rebuild a worker.

Fix has been tested as described for reproduction. Also regression tests has 
been performed against live JMS server.

> JMS Connection Fails After JMS servers Change behind JNDI
> -
>
> Key: NIFI-5869
> URL: https://issues.apache.org/jira/browse/NIFI-5869
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Attachments: 3261.patch.txt, JNDI_JMS_Exception.txt
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> JMS Connection Fails After JMS servers Change behind JNDI.
> Reproduce:
>  # Define and enable JNDI Controller Service
>  # Create a flow with ConsumeJMS or PublishJMS processors with controller 
> service defined in #1.
>  # Consume and publish at least one message to ensure the connectivity can be 
> established.
>  # Change JNDI configuration for the same connection factory to point to new 
> JMS servers.
>  # Stop JMS service on previous servers
>  # Observe failure in ConsumeJMS/PublishJMS (Caused by: 
> javax.jms.JMSException: Failed to connect to any server at: 
> tcp://jms_server1:12345)
>  
> Work Around:
>  # Disable JNDI Controller Service
>  # Enable JNDI Controller Service and dependent processors.
>  
> Possible Issue/Fix:
>  * AbstractJMSProcessor has a method "buildTargetResource", in which 
> connection factory is instantiated and then cached in workerPool in onTrigger 
> .
>  * Issue: Once cached, it will be reused forever.
>  * Fix: on connectivity failure there should be an attempt to rebuild the 
> worker. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5869) JMS Connection Fails After JMS servers Change behind JNDI

2019-01-10 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5869:

Fix Version/s: (was: 1.8.0)
   Attachment: 3261.patch.txt
   Status: Patch Available  (was: In Progress)

> JMS Connection Fails After JMS servers Change behind JNDI
> -
>
> Key: NIFI-5869
> URL: https://issues.apache.org/jira/browse/NIFI-5869
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Attachments: 3261.patch.txt, JNDI_JMS_Exception.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> JMS Connection Fails After JMS servers Change behind JNDI.
> Reproduce:
>  # Define and enable JNDI Controller Service
>  # Create a flow with ConsumeJMS or PublishJMS processors with controller 
> service defined in #1.
>  # Consume and publish at least one message to ensure the connectivity can be 
> established.
>  # Change JNDI configuration for the same connection factory to point to new 
> JMS servers.
>  # Stop JMS service on previous servers
>  # Observe failure in ConsumeJMS/PublishJMS (Caused by: 
> javax.jms.JMSException: Failed to connect to any server at: 
> tcp://jms_server1:12345)
>  
> Work Around:
>  # Disable JNDI Controller Service
>  # Enable JNDI Controller Service and dependent processors.
>  
> Possible Issue/Fix:
>  * AbstractJMSProcessor has a method "buildTargetResource", in which 
> connection factory is instantiated and then cached in workerPool in onTrigger 
> .
>  * Issue: Once cached, it will be reused forever.
>  * Fix: on connectivity failure there should be an attempt to rebuild the 
> worker. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (NIFI-5909) PutElasticsearchHttpRecord doesn't allow to customize the timestamp format

2019-01-09 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky resolved NIFI-5909.
-
   Resolution: Fixed
Fix Version/s: 1.9.0

This is addressed by PR #3227

> PutElasticsearchHttpRecord doesn't allow to customize the timestamp format
> --
>
> Key: NIFI-5909
> URL: https://issues.apache.org/jira/browse/NIFI-5909
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Alex Savitsky
>Assignee: Alex Savitsky
>Priority: Major
> Fix For: 1.9.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> All timestamps are sent to Elasticsearch in the "-MM-dd HH:mm:ss" format, 
> coming from the RecordFieldType.TIMESTAMP.getDefaultFormat(). There's plenty 
> of use cases that call for Elasticsearch data to be presented differently, 
> and the format should be customizable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (NIFI-5937) PutElasticsearchHttpRecord uses system default encoding

2019-01-08 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky resolved NIFI-5937.
-
   Resolution: Fixed
Fix Version/s: 1.9.0

This is addressed by PR #3250

> PutElasticsearchHttpRecord uses system default encoding
> ---
>
> Key: NIFI-5937
> URL: https://issues.apache.org/jira/browse/NIFI-5937
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Alex Savitsky
>Assignee: Alex Savitsky
>Priority: Major
> Fix For: 1.9.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> PutElasticsearchHttpRecord line 348:
> {code:java}
> json.append(out.toString());
> {code}
> This results in the conversion being done using system default encoding, 
> possibly garbling non-ASCII characters in the output. Should use the encoding 
> configured in the processor in the toString call.
> As a workaround, the "file.encoding" system property can be specified 
> explicitly in the bootstrap.conf:
> {code:java}
> java.arg.7=-Dfile.encoding=UTF-8{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5826) UpdateRecord processor throwing PatternSyntaxException

2019-01-08 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5826:

   Resolution: Fixed
 Assignee: Koji Kawamura  (was: Ed Berezitsky)
Fix Version/s: 1.9.0
   Status: Resolved  (was: Patch Available)

This is addressed by PR3200

> UpdateRecord processor throwing PatternSyntaxException
> --
>
> Key: NIFI-5826
> URL: https://issues.apache.org/jira/browse/NIFI-5826
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.7.1
> Environment: Nifi in docker container
>Reporter: ravi kargam
>Assignee: Koji Kawamura
>Priority: Minor
> Fix For: 1.9.0
>
> Attachments: NIFI-5826_PR-3183.patch, 
> UpdateRecord_Config_Exception.JPG
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> with replacement value strategy as Record Path Value,
> I am trying to replace square bracket symbol *[* with parenthesis symbol *(* 
> in my employeeName column in my csvReader structure with below syntax
> replaceRegex(/employeeName, "[\[]", "(")
> Processor is throwing following exception.
> RecordPathException: java.util.regex.PatternSyntaxException: Unclosed 
> character class near index 4 [\\[]
> It worked fine with other special characters such as \{, }, <, >, ;, _, "
> For double qoute ("), i had to use single escape character, for above listed 
> other characters, worked fine without any escape character. Other folks in 
> Nifi Slack tried \s, \d, \w, \. 
> looks like non of them worked.
> replace function worked for replacing [ and ]characters. didn't test any 
> other characters.
> Please address resolve the issue.
> Regards,
> Ravi
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5874) CSVReader and CSVRecordSetWriter inject transformed backslash sequences from input

2018-12-06 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711403#comment-16711403
 ] 

Ed Berezitsky commented on NIFI-5874:
-

Attached template to fully reproduce this issue.

Even if initial CSV fully complies with CSV standards and escapes backslash 
with double backslash (\\t), first UpdateRecord will write output with single 
backslash, and then next UpdateRecord will convert it into the tab.

> CSVReader and CSVRecordSetWriter inject transformed backslash sequences from 
> input
> --
>
> Key: NIFI-5874
> URL: https://issues.apache.org/jira/browse/NIFI-5874
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Priority: Major
> Attachments: csv_bug.xml
>
>
> If there is backslash sequence (like \t, \n, etc) in the input, 
> CSVRecordSetWriter transforms them into actual characters (new line, tab, 
> etc) in output record.
> For example, input record:
>  
> {code:java}
> case,a,a1
> tab,=\t=,-
> {code}
>  
> Update Record with `/a1: /a` (just copy value from one field to another)
> JsonRecordSetWriter will produce:
> {code:java}
> [{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
> and CSVRecordSetWriter will produce:
> {code:java}
> case,a,a1
> tab,= =,= =
> {code}
> there is a actual "tab" in between "="
>  In JSON objecr above, \t mean escaped tab. The actual issue is coming from 
> both CSV Reader and Writer.
> Reader converts unescaped sequence of characters into actual character, but 
> Writer doesn't escape them back when writes results, while JSON Writer does 
> that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5874) CSVReader and CSVRecordSetWriter inject transformed backslash sequences from input

2018-12-06 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5874:

 Attachment: csv_bug.xml
Description: 
If there is backslash sequence (like \t, \n, etc) in the input, 
CSVRecordSetWriter transforms them into actual characters (new line, tab, etc) 
in output record.

For example, input record:

 
{code:java}
case,a,a1
tab,=\t=,-
{code}
 

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:
{code:java}
[{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
and CSVRecordSetWriter will produce:
{code:java}
case,a,a1
tab,= =,= =
{code}
there is a actual "tab" in between "="

 In JSON objecr above, \t mean escaped tab. The actual issue is coming from 
both CSV Reader and Writer.

Reader converts unescaped sequence of characters into actual character, but 
Writer doesn't escape them back when writes results, while JSON Writer does 
that.

  was:
If there is backslash sequence (like \t, \n, etc) in the input, 
CSVRecordSetWriter transforms them into actual characters (new line, tab, etc) 
in output record.

For example, input record:

 
{code:java}
case,a,a1
tab,=\t=,-
{code}
 

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:
{code:java}
[{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
while CSVRecordSetWriter will produce:
{code:java}
case,a,a1
tab,= =,= =
{code}
there is a actual "tab" in between "="

 

Summary: CSVReader and CSVRecordSetWriter inject transformed backslash 
sequences from input  (was: CSVRecordSetWriter inject transformed backslash 
sequences from input)

Further digging into the issue shows that there are both CSV Reader and Writer 
causing this issue.

> CSVReader and CSVRecordSetWriter inject transformed backslash sequences from 
> input
> --
>
> Key: NIFI-5874
> URL: https://issues.apache.org/jira/browse/NIFI-5874
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Priority: Major
> Attachments: csv_bug.xml
>
>
> If there is backslash sequence (like \t, \n, etc) in the input, 
> CSVRecordSetWriter transforms them into actual characters (new line, tab, 
> etc) in output record.
> For example, input record:
>  
> {code:java}
> case,a,a1
> tab,=\t=,-
> {code}
>  
> Update Record with `/a1: /a` (just copy value from one field to another)
> JsonRecordSetWriter will produce:
> {code:java}
> [{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
> and CSVRecordSetWriter will produce:
> {code:java}
> case,a,a1
> tab,= =,= =
> {code}
> there is a actual "tab" in between "="
>  In JSON objecr above, \t mean escaped tab. The actual issue is coming from 
> both CSV Reader and Writer.
> Reader converts unescaped sequence of characters into actual character, but 
> Writer doesn't escape them back when writes results, while JSON Writer does 
> that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5874) CSVRecordSetWriter inject transformed backslash sequences from input

2018-12-05 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5874:

Description: 
If there is backslash sequence (like \t, \n, etc) in the input, 
CSVRecordSetWriter transforms them into actual characters (new line, tab, etc) 
in output record.

For example, input record:

 
{code:java}
case,a,a1
tab,=\t=,-
{code}
 

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:
{code:java}
[{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
while CSVRecordSetWriter will produce:
{code:java}
case,a,a1
tab,= =,= =
{code}
there is a actual "tab" in between "="

 

  was:
If there is backslash sequence (like \t, \n, etc) in the input, 
CSVRecordSetWriter transforms them into actual characters (new line, tab, etc) 
in output record.

For example, input record:

 
{code:java}
case,a,a1
tab,=\t=,-
{code}
 

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:
{code:java}
[{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
while CSVRecordSetWriter will produce:

 
{code:java}
case,a,a1
tab,= =,= =
{code}
there is a actual "tab" in between "="

 


> CSVRecordSetWriter inject transformed backslash sequences from input
> 
>
> Key: NIFI-5874
> URL: https://issues.apache.org/jira/browse/NIFI-5874
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Priority: Major
>
> If there is backslash sequence (like \t, \n, etc) in the input, 
> CSVRecordSetWriter transforms them into actual characters (new line, tab, 
> etc) in output record.
> For example, input record:
>  
> {code:java}
> case,a,a1
> tab,=\t=,-
> {code}
>  
> Update Record with `/a1: /a` (just copy value from one field to another)
> JsonRecordSetWriter will produce:
> {code:java}
> [{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
> while CSVRecordSetWriter will produce:
> {code:java}
> case,a,a1
> tab,= =,= =
> {code}
> there is a actual "tab" in between "="
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-5874) CSVRecordSetWriter inject transformed backslash sequences from input

2018-12-05 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5874:

Description: 
If there is backslash sequence (like \t, \n, etc) in the input, 
CSVRecordSetWriter transforms them into actual characters (new line, tab, etc) 
in output record.

For example, input record:

 
{code:java}
case,a,a1
tab,=\t=,-
{code}
 

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:
{code:java}
[{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
while CSVRecordSetWriter will produce:

 
{code:java}
case,a,a1
tab,= =,= =
{code}
there is a actual "tab" in between "="

 

  was:
If there is backslash sequence (like \t, \n, etc) in the input, 
CSVRecordSetWriter transforms them into actual characters (new line, tab, etc) 
in output record.

For example, input record:

 
{code:java}
case,a,a1
tab,=\t=,-
{code}
 

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:

[\\{"case":"tab","a":"=\t=","a1":"=\t="}]

while CSVRecordSetWriter will produce:

 
{code:java}
case,a,a1
tab,= =,= =
{code}
there is a actual "tab" in between "="

 


> CSVRecordSetWriter inject transformed backslash sequences from input
> 
>
> Key: NIFI-5874
> URL: https://issues.apache.org/jira/browse/NIFI-5874
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Priority: Major
>
> If there are backslash sequences (like \t, \n, etc.) in the input, 
> CSVRecordSetWriter transforms them into the actual characters (newline, tab, 
> etc.) in the output record.
> For example, input record:
>  
> {code:java}
> case,a,a1
> tab,=\t=,-
> {code}
>  
> Update Record with `/a1: /a` (just copy value from one field to another)
> JsonRecordSetWriter will produce:
> {code:java}
> [{"case":"tab","a":"=\t=","a1":"=\t="}]{code}
> while CSVRecordSetWriter will produce:
>  
> {code:java}
> case,a,a1
> tab,= =,= =
> {code}
> there is an actual tab character between the "=" signs
>  





[jira] [Updated] (NIFI-5874) CSVRecordSetWriter inject transformed backslash sequences from input

2018-12-05 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5874:

Description: 
If there are backslash sequences (like \t, \n, etc.) in the input, 
CSVRecordSetWriter transforms them into the actual characters (newline, tab, etc.) 
in the output record.

For example, input record:

 
{code:java}
case,a,a1
tab,=\t=,-
{code}
 

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:

[\\{"case":"tab","a":"=\t=","a1":"=\t="}]

while CSVRecordSetWriter will produce:

 
{code:java}
case,a,a1
tab,= =,= =
{code}
there is an actual tab character between the "=" signs

 

  was:
If there is backslash sequence (like \t, \n, etc) in the input, 
CSVRecordSetWriter transforms them into actual characters (new line, tab, etc) 
in output record.

For example, input record:

```

case,a,a1
period,=\t=,-

```

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:

```

[\{"case":"period","a":"=\t=","a1":"=\t="}]

```

while CSVRecordSetWriter will produce:

```

case,a,a1
period,= =,= =

```

there is a actual "tab" in between "="

 


> CSVRecordSetWriter inject transformed backslash sequences from input
> 
>
> Key: NIFI-5874
> URL: https://issues.apache.org/jira/browse/NIFI-5874
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Priority: Major
>
> If there are backslash sequences (like \t, \n, etc.) in the input, 
> CSVRecordSetWriter transforms them into the actual characters (newline, tab, 
> etc.) in the output record.
> For example, input record:
>  
> {code:java}
> case,a,a1
> tab,=\t=,-
> {code}
>  
> Update Record with `/a1: /a` (just copy value from one field to another)
> JsonRecordSetWriter will produce:
> [\\{"case":"tab","a":"=\t=","a1":"=\t="}]
> while CSVRecordSetWriter will produce:
>  
> {code:java}
> case,a,a1
> tab,= =,= =
> {code}
> there is an actual tab character between the "=" signs
>  





[jira] [Created] (NIFI-5874) CSVRecordSetWriter inject transformed backslash sequences from input

2018-12-05 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5874:
---

 Summary: CSVRecordSetWriter inject transformed backslash sequences 
from input
 Key: NIFI-5874
 URL: https://issues.apache.org/jira/browse/NIFI-5874
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Ed Berezitsky


If there are backslash sequences (like \t, \n, etc.) in the input, 
CSVRecordSetWriter transforms them into the actual characters (newline, tab, etc.) 
in the output record.

For example, input record:

```

case,a,a1
period,=\t=,-

```

Update Record with `/a1: /a` (just copy value from one field to another)

JsonRecordSetWriter will produce:

```

[\{"case":"period","a":"=\t=","a1":"=\t="}]

```

while CSVRecordSetWriter will produce:

```

case,a,a1
period,= =,= =

```

there is an actual tab character between the "=" signs

 





[jira] [Updated] (NIFI-5869) JMS Connection Fails After JMS servers Change behind JNDI

2018-12-04 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5869:

Description: 
JMS Connection Fails After JMS servers Change behind JNDI.

Reproduce:
 # Define and enable JNDI Controller Service
 # Create a flow with ConsumeJMS or PublishJMS processors with controller 
service defined in #1.
 # Consume and publish at least one message to ensure the connectivity can be 
established.
 # Change JNDI configuration for the same connection factory to point to new 
JMS servers.
 # Stop JMS service on previous servers
 # Observe failure in ConsumeJMS/PublishJMS (Caused by: javax.jms.JMSException: 
Failed to connect to any server at: tcp://jms_server1:12345)

 

Work Around:
 # Disable JNDI Controller Service
 # Enable JNDI Controller Service and dependent processors.

 

Possible Issue/Fix:
 * AbstractJMSProcessor has a method "buildTargetResource", in which the connection 
factory is instantiated and then cached in workerPool in onTrigger.
 * Issue: once cached, it will be reused forever.
 * Fix: on connectivity failure, there should be an attempt to rebuild the 
worker. 

  was:
JMS Connection Fails After JMS servers Change behind JNDI.

Reproduce:
 # Define and enable JNDI Controller Service
 # Create a flow with ConsumeJMS or PublishJMS processors with controller 
service defined in #1.
 # Consume and publish at least one message to ensure the connectivity can be 
established.
 # Change JNDI configuration for the same connection factory to point to new 
JMS servers.
 # Stop JMS service on previous servers
 # Observe failure in ConsumeJMS/PublishJMS (Caused by: javax.jms.JMSException: 
Failed to connect to any server at: tcp://jms_server1:12345)

 

Work Around:
 # Disable JNDI Controller Service
 # Enable JNDI Controller Service and dependent processors.

 

Possible Issue/Fix:
 * AbstractJMSProcessor has a method "buildTargetResource", in which connection 
factory in instantiated and then cached in workerPool in onTrigger .
 * Issues: Once cached, it will be reused forever.
 * Fix: on connectivity failure there should be an attempt to rebuild the 
worker. 


> JMS Connection Fails After JMS servers Change behind JNDI
> -
>
> Key: NIFI-5869
> URL: https://issues.apache.org/jira/browse/NIFI-5869
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Fix For: 1.8.0
>
> Attachments: JNDI_JMS_Exception.txt
>
>
> JMS Connection Fails After JMS servers Change behind JNDI.
> Reproduce:
>  # Define and enable JNDI Controller Service
>  # Create a flow with ConsumeJMS or PublishJMS processors with controller 
> service defined in #1.
>  # Consume and publish at least one message to ensure the connectivity can be 
> established.
>  # Change JNDI configuration for the same connection factory to point to new 
> JMS servers.
>  # Stop JMS service on previous servers
>  # Observe failure in ConsumeJMS/PublishJMS (Caused by: 
> javax.jms.JMSException: Failed to connect to any server at: 
> tcp://jms_server1:12345)
>  
> Work Around:
>  # Disable JNDI Controller Service
>  # Enable JNDI Controller Service and dependent processors.
>  
> Possible Issue/Fix:
>  * AbstractJMSProcessor has a method "buildTargetResource", in which the 
> connection factory is instantiated and then cached in workerPool in onTrigger.
>  * Issue: once cached, it will be reused forever.
>  * Fix: on connectivity failure, there should be an attempt to rebuild the 
> worker. 
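The proposed "rebuild the worker" fix can be sketched generically. This is not NiFi's actual worker-pool code; it is a minimal illustration, under the assumption that the cached resource can always be re-created from its factory (e.g. a fresh JNDI lookup), of dropping the stale resource and retrying once after a failure.

```java
import java.util.function.Supplier;

// Minimal sketch of rebuilding a cached resource on failure instead of
// reusing it forever: the factory re-resolves the resource, and the failed
// operation is retried once against the fresh instance.
public class ReconnectingWorker<T> {
    public interface Action<T, R> { R apply(T resource) throws Exception; }

    private final Supplier<T> factory;
    private T cached;

    public ReconnectingWorker(Supplier<T> factory) {
        this.factory = factory;
    }

    public <R> R execute(Action<T, R> action) throws Exception {
        if (cached == null) {
            cached = factory.get();      // first use: build and cache the resource
        }
        try {
            return action.apply(cached);
        } catch (Exception e) {
            cached = factory.get();      // drop the stale worker, rebuild via the factory
            return action.apply(cached); // single retry with the fresh resource
        }
    }
}
```

With this shape, step 4 of the reproduction (JNDI now pointing at new servers) would be picked up on the first failed publish/consume instead of requiring the controller service to be disabled and re-enabled.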





[jira] [Created] (NIFI-5869) JMS Connection Fails After JMS servers Change behind JNDI

2018-12-04 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5869:
---

 Summary: JMS Connection Fails After JMS servers Change behind JNDI
 Key: NIFI-5869
 URL: https://issues.apache.org/jira/browse/NIFI-5869
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky
 Fix For: 1.8.0
 Attachments: JNDI_JMS_Exception.txt

JMS Connection Fails After JMS servers Change behind JNDI.

Reproduce:
 # Define and enable JNDI Controller Service
 # Create a flow with ConsumeJMS or PublishJMS processors with controller 
service defined in #1.
 # Consume and publish at least one message to ensure the connectivity can be 
established.
 # Change JNDI configuration for the same connection factory to point to new 
JMS servers.
 # Stop JMS service on previous servers
 # Observe failure in ConsumeJMS/PublishJMS (Caused by: javax.jms.JMSException: 
Failed to connect to any server at: tcp://jms_server1:12345)

 

Work Around:
 # Disable JNDI Controller Service
 # Enable JNDI Controller Service and dependent processors.

 

Possible Issue/Fix:
 * AbstractJMSProcessor has a method "buildTargetResource", in which the connection 
factory is instantiated and then cached in workerPool in onTrigger.
 * Issue: once cached, it will be reused forever.
 * Fix: on connectivity failure, there should be an attempt to rebuild the 
worker. 





[jira] [Created] (NIFI-5856) Add capability to assign available matching controller services to processors during import from registry

2018-11-29 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5856:
---

 Summary: Add capability to assign available matching controller 
services to processors during import from registry
 Key: NIFI-5856
 URL: https://issues.apache.org/jira/browse/NIFI-5856
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Ed Berezitsky


As a user, I would like to reduce manual configuration of components after 
importing flows from NiFi Registry.

Use cases:
 * a component that uses controller service(s) defined at the *parent* (or 
higher) level (e.g. record-based processors, DB pools, etc.) can have 
controllers assigned by default if the registered ID is not available (flow versioned 
from another NiFi instance)
 * a controller service that is in the scope of the imported flow uses another 
controller at the *parent* (or higher) level (e.g. record 
readers/writers using a schema registry).

Current state:
 * a lookup for a controller service is done by ID. If the ID is not found, a 
controller won't be assigned; the property of the processor/controller will stay 
blank and will require manual configuration/selection

Specifications/Requirements:
 * Change the current behavior to enable default assignment of controller services 
to a processor/controller property in case the desired controller service cannot be 
found by ID.
 * In order to reduce wrong automatic assignments, both the type and the name of a 
controller service should be considered. 
 * Since names aren't unique, add a NiFi property to specify a strict or 
non-strict policy for multi-match:
 ** strict mode will prevent automatic assignment of a controller service, and 
the property in the processor/controller will stay blank (as per the current 
specification).
 ** non-strict mode will allow any of the matching controllers to be assigned (first 
found).
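The strict vs. non-strict policy can be sketched as follows. ServiceRef and the matching helper are illustrative stand-ins, not NiFi's API: candidates visible at the parent scope are filtered by type and name, and the strict flag decides whether an ambiguous multi-match yields nothing or the first hit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Illustration of the proposed matching policy; ServiceRef stands in for a
// controller service visible at the parent (or higher) scope.
public class ServiceMatcher {
    public static class ServiceRef {
        final String id; final String type; final String name;
        public ServiceRef(String id, String type, String name) {
            this.id = id; this.type = type; this.name = name;
        }
    }

    public static Optional<ServiceRef> match(List<ServiceRef> available,
                                             String wantedType, String wantedName,
                                             boolean strict) {
        List<ServiceRef> candidates = new ArrayList<>();
        for (ServiceRef s : available) {
            if (s.type.equals(wantedType) && s.name.equals(wantedName)) {
                candidates.add(s);
            }
        }
        if (candidates.size() == 1) {
            return Optional.of(candidates.get(0)); // unambiguous: safe to assign
        }
        if (candidates.isEmpty() || strict) {
            return Optional.empty();               // strict multi-match: leave the property blank
        }
        return Optional.of(candidates.get(0));     // non-strict: first found
    }
}
```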

 





[jira] [Updated] (NIFI-5826) UpdateRecord processor throwing PatternSyntaxException

2018-11-27 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5826:

Affects Version/s: 1.6.0
   1.7.0
   1.8.0
   1.7.1
   Attachment: NIFI-5826_PR-3183.patch
   Status: Patch Available  (was: In Progress)

> UpdateRecord processor throwing PatternSyntaxException
> --
>
> Key: NIFI-5826
> URL: https://issues.apache.org/jira/browse/NIFI-5826
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.7.1, 1.8.0, 1.7.0, 1.6.0, 1.5.0
> Environment: Nifi in docker container
>Reporter: ravi kargam
>Assignee: Ed Berezitsky
>Priority: Minor
> Attachments: NIFI-5826_PR-3183.patch, 
> UpdateRecord_Config_Exception.JPG
>
>
> With the replacement value strategy set to Record Path Value,
> I am trying to replace the square bracket symbol *[* with the parenthesis symbol *(* 
> in my employeeName column in my csvReader structure with the syntax below:
> replaceRegex(/employeeName, "[\[]", "(")
> The processor is throwing the following exception:
> RecordPathException: java.util.regex.PatternSyntaxException: Unclosed 
> character class near index 4 [\\[]
> It worked fine with other special characters such as \{, }, <, >, ;, _, "
> For the double quote ("), I had to use a single escape character; the other 
> characters listed above worked fine without any escape character. Other folks in 
> NiFi Slack tried \s, \d, \w, \. 
> It looks like none of them worked.
> The replace function worked for replacing the [ and ] characters; I didn't test any 
> other characters.
> Please resolve the issue.
> Regards,
> Ravi
>  
>  
>  
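For reference, escaping a bracket inside a character class is valid Java regex. The snippet below uses plain java.util.regex (not the record path engine) to show the pattern the reporter attempted working as expected, which suggests the record path layer adds extra escaping before the pattern reaches Pattern.compile:

```java
public class BracketRegexDemo {
    public static void main(String[] args) {
        // "[\\[]" in Java source is the regex [\[] : a character class
        // containing an escaped '[' , which is legal and matches '['.
        String result = "name[0]".replaceAll("[\\[]", "(");
        System.out.println(result); // name(0]
    }
}
```

The exception message ("Unclosed character class near index 4 [\\[]") shows the doubled backslash that the compiled pattern actually received.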





[jira] [Updated] (NIFI-5832) PutHiveQL - Flowfile isn't transferred to failure rel on actual failure

2018-11-20 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5832:

Description: 
PutHiveQL gets stuck if an error occurs when a flow file contains multiple 
statements.

Example:

 
{code:java}
set tez.queue.name=qwe;
create table t_table1 (s string) stored as orc;{code}
 

This will fail if such a queue doesn't exist. But the flow file will be stuck in the 
incoming connection forever without even emitting a bulletin (a bulletin will appear 
only when the processor is in debug mode).

Another example:
{code:java}
insert into table t_table1 select 'test' from test limit 1;
insert into table non_existing_table select * from another_table;{code}
Note: the first statement is correct; the second should fail.

  was:
PutHiveQL is stuck if error occurred when flow file contains multiple 
statements.

Example:

 
{code:java}
 
set tez.queue.name=qwe;
create table t_table1 (s string) stored as orc;{code}
 

This will fail if such queue doesn't exist. But FF will be stuck in incoming 
connection forever without even emitting bulletin (bulletin will appear only 
when the processor is in debug mode).

Another example:
{code:java}
insert into table t_table1 select 'test' from test limit 1;
insert into table non_existing_table select * from another_table;{code}
Note, first statement is correct one, second should fail.


> PutHiveQL - Flowfile isn't transferred to failure rel on actual failure
> ---
>
> Key: NIFI-5832
> URL: https://issues.apache.org/jira/browse/NIFI-5832
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>
> PutHiveQL gets stuck if an error occurs when a flow file contains multiple 
> statements.
> Example:
>  
> {code:java}
> set tez.queue.name=qwe;
> create table t_table1 (s string) stored as orc;{code}
>  
> This will fail if such a queue doesn't exist. But the flow file will be stuck in the 
> incoming connection forever without even emitting a bulletin (a bulletin will 
> appear only when the processor is in debug mode).
> Another example:
> {code:java}
> insert into table t_table1 select 'test' from test limit 1;
> insert into table non_existing_table select * from another_table;{code}
> Note: the first statement is correct; the second should fail.
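The expected behavior can be sketched with a generic statement loop (this is not the actual PutHiveQL implementation; StatementExecutor is an illustrative stand-in for a JDBC call): statements run in order, and the first failure routes the flow file to the failure relationship instead of leaving it queued.

```java
import java.util.List;

// Generic sketch: execute statements in order; on the first failure, report
// "failure" so the flow file can be transferred to the failure relationship
// rather than staying stuck in the incoming connection.
public class MultiStatementRunner {
    public interface StatementExecutor { void execute(String sql) throws Exception; }

    public static String run(List<String> statements, StatementExecutor executor) {
        for (String sql : statements) {
            try {
                executor.execute(sql);
            } catch (Exception e) {
                return "failure";   // route to failure, emit a bulletin here
            }
        }
        return "success";
    }
}
```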





[jira] [Assigned] (NIFI-5826) UpdateRecord processor throwing PatternSyntaxException

2018-11-15 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky reassigned NIFI-5826:
---

Assignee: Ed Berezitsky

> UpdateRecord processor throwing PatternSyntaxException
> --
>
> Key: NIFI-5826
> URL: https://issues.apache.org/jira/browse/NIFI-5826
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.5.0
> Environment: Nifi in docker container
>Reporter: ravi kargam
>Assignee: Ed Berezitsky
>Priority: Minor
> Attachments: UpdateRecord_Config_Exception.JPG
>
>
> With the replacement value strategy set to Record Path Value,
> I am trying to replace the square bracket symbol *[* with the parenthesis symbol *(* 
> in my employeeName column in my csvReader structure with the syntax below:
> replaceRegex(/employeeName, "[\[]", "(")
> The processor is throwing the following exception:
> RecordPathException: java.util.regex.PatternSyntaxException: Unclosed 
> character class near index 4 [\\[]
> It worked fine with other special characters such as \{, }, <, >, ;, _, "
> For the double quote ("), I had to use a single escape character; the other 
> characters listed above worked fine without any escape character. Other folks in 
> NiFi Slack tried \s, \d, \w, \. 
> It looks like none of them worked.
> The replace function worked for replacing the [ and ] characters; I didn't test any 
> other characters.
> Please resolve the issue.
> Regards,
> Ravi
>  
>  
>  





[jira] [Commented] (NIFI-5810) Add EL support to User Name property in ConsumeJMS and PublishJMS

2018-11-09 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681926#comment-16681926
 ] 

Ed Berezitsky commented on NIFI-5810:
-

Not adding GetJMSQueue, GetJMSTopic and PutJMS is intentional. These processors 
are deprecated anyway. 

> Add EL support to User Name property in ConsumeJMS and PublishJMS
> -
>
> Key: NIFI-5810
> URL: https://issues.apache.org/jira/browse/NIFI-5810
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>
> ConsumeJMS and PublishJMS don't support EL for the User Name property.
> As a result, the user name must be hard-coded, so updating this property per 
> environment makes flows non-versionable.





[jira] [Created] (NIFI-5810) Add EL support to User Name property in ConsumeJMS and PublishJMS

2018-11-09 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5810:
---

 Summary: Add EL support to User Name property in ConsumeJMS and 
PublishJMS
 Key: NIFI-5810
 URL: https://issues.apache.org/jira/browse/NIFI-5810
 Project: Apache NiFi
  Issue Type: Improvement
Affects Versions: 1.8.0
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


ConsumeJMS and PublishJMS don't support EL for the User Name property.

As a result, the user name must be hard-coded, so updating this property per 
environment makes flows non-versionable.





[jira] [Created] (NIFI-5782) Masking Sensitive Properties in UI Textbox

2018-11-01 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5782:
---

 Summary: Masking Sensitive Properties in UI Textbox
 Key: NIFI-5782
 URL: https://issues.apache.org/jira/browse/NIFI-5782
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core UI
Affects Versions: 1.8.0
Reporter: Ed Berezitsky


As a UI user, I would like to be able to enter sensitive values in a secure way.

Currently, the UI does not show the values of already defined properties, displaying 
"Sensitive Value Set" instead.

But when a user needs to set a new value or change an existing one, the input is 
shown in plain text.

This should be changed to the industry-accepted format: masked with *. 
Optionally, a "Show value" control can be added to unmask the value for validation if needed.





[jira] [Comment Edited] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670477#comment-16670477
 ] 

Ed Berezitsky edited comment on NIFI-5770 at 10/31/18 6:26 PM:
---

[~ivanomarot], please confirm the version you are facing this issue in.

As you describe it, the issue has been reported and fixed as part of 
https://issues.apache.org/jira/browse/NIFI-4968 .

Fix is available since v 1.6.

I also tried to reproduce with bad syntax, but it gives only a single error in the 
log + bulletin + processor validation indicator, and until you change any property, 
it won't run validations anymore.


was (Author: bdesert):
[~ivanomarot], please confirm the version you are facing this issue in.

As you describe it, the issue has been reported and fixed as part of 
https://issues.apache.org/jira/browse/NIFI-4968 .

Fixed is available since v 1.6.

I also tried to reproduce with bad syntax, but it gives only single error in 
log+bulletin+processor validation indicator, and until you change any property, 
it won't be running validations anymore.

> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: 3117.patch, ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with the Jython engine has a memory leak.
>  It uses JythonScriptEngineConfigurator class to configure jython execution 
> environment.
>  The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check if a module has already been added previously.
>  As a result, with each execution (onTrigger), string value of module 
> property is being appended, and never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory 
> leak is not reproducible in it,
> because ISP builds the engine and compiles the code only once (and rebuilds 
> every time any relevant property is changed).
>  Attached:
>  * template with a flow to reproduce the bug
>  * simple python modules (to be unpacked under /tmp)
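The missing guard can be shown with a plain Java stand-in for Jython's sys.path (this is an illustration, not the actual JythonScriptEngineConfigurator code): the unconditional append grows the list on every trigger, while an append-if-absent check keeps it bounded.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Jython's sys.path, showing the leak and the guarded fix.
public class ModulePathRegistry {
    private final List<String> sysPath = new ArrayList<>();

    // What the configurator effectively does today: append on every onTrigger,
    // so the same module path accumulates without bound.
    public void append(String modulePath) {
        sysPath.add(modulePath);
    }

    // Guarded version: only append a path that is not already present.
    public void appendIfAbsent(String modulePath) {
        if (!sysPath.contains(modulePath)) {
            sysPath.add(modulePath);
        }
    }

    public int size() {
        return sysPath.size();
    }
}
```

This also explains why InvokeScriptedProcessor is unaffected: it evaluates the configuration once per (re)build rather than once per trigger.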





[jira] [Commented] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670477#comment-16670477
 ] 

Ed Berezitsky commented on NIFI-5770:
-

[~ivanomarot], please confirm the version you are facing this issue in.

As you describe it, the issue has been reported and fixed as part of 
https://issues.apache.org/jira/browse/NIFI-4968 .

Fix is available since v 1.6.

I also tried to reproduce with bad syntax, but it gives only a single error in the 
log + bulletin + processor validation indicator, and until you change any property, 
it won't run validations anymore.

> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: 3117.patch, ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with the Jython engine has a memory leak.
>  It uses JythonScriptEngineConfigurator class to configure jython execution 
> environment.
>  The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check if a module has already been added previously.
>  As a result, with each execution (onTrigger), string value of module 
> property is being appended, and never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory 
> leak is not reproducible in it,
> because ISP builds the engine and compiles the code only once (and rebuilds 
> every time any relevant property is changed).
>  Attached:
>  * template with a flow to reproduce the bug
>  * simple python modules (to be unpacked under /tmp)





[jira] [Updated] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5770:

Attachment: 3117.patch
Status: Patch Available  (was: In Progress)

> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: 3117.patch, ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with the Jython engine has a memory leak.
>  It uses JythonScriptEngineConfigurator class to configure jython execution 
> environment.
>  The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check if a module has already been added previously.
>  As a result, with each execution (onTrigger), string value of module 
> property is being appended, and never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory 
> leak is not reproducible in it,
> because ISP builds the engine and compiles the code only once (and rebuilds 
> every time any relevant property is changed).
>  Attached:
>  * template with a flow to reproduce the bug
>  * simple python modules (to be unpacked under /tmp)





[jira] [Updated] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-30 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5770:

Description: 
ExecuteScript with the Jython engine has a memory leak.
 It uses JythonScriptEngineConfigurator class to configure jython execution 
environment.
 The problem is in the line:
{code:java}
engine.eval("sys.path.append('" + modulePath + "')");{code}
There is no check if a module has already been added previously.
 As a result, with each execution (onTrigger), string value of module property 
is being appended, and never reset.

Although InvokeScriptedProcessor uses the same engine configurator, the memory leak 
is not reproducible in it,
 because ISP builds the engine and compiles the code only once (and rebuilds 
every time any relevant property is changed).

 Attached:
 * template with a flow to reproduce the bug
 * simple python modules (to be unpacked under /tmp)

  was:
ExecuteScript with Jython engine has memory leak.
 It uses JythonScriptEngineConfigurator class to configure jython execution 
environment.
 The problem is in the line:
{code:java}
engine.eval("sys.path.append('" + modulePath + "')");{code}
There is no check if a module has already been added previously.
 As a result, with each execution (onTrigger), string value of module property 
is being appended, and never reset.

Although InvokeScriptedProcessor uses the same engine configurator, memory leak 
is not reproducable in it,
 because ISP builds the engine and compile the code only once (and rebuilds 
every time any relevant property is changed).

 


> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with the Jython engine has a memory leak.
>  It uses JythonScriptEngineConfigurator class to configure jython execution 
> environment.
>  The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check if a module has already been added previously.
>  As a result, with each execution (onTrigger), string value of module 
> property is being appended, and never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory 
> leak is not reproducible in it,
> because ISP builds the engine and compiles the code only once (and rebuilds 
> every time any relevant property is changed).
>  Attached:
>  * template with a flow to reproduce the bug
>  * simple python modules (to be unpacked under /tmp)





[jira] [Updated] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-30 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5770:

Description: 
ExecuteScript with the Jython engine has a memory leak.
 It uses JythonScriptEngineConfigurator class to configure jython execution 
environment.
 The problem is in the line:
{code:java}
engine.eval("sys.path.append('" + modulePath + "')");{code}
There is no check if a module has already been added previously.
 As a result, with each execution (onTrigger), string value of module property 
is being appended, and never reset.

Although InvokeScriptedProcessor uses the same engine configurator, the memory leak 
is not reproducible in it,
 because ISP builds the engine and compiles the code only once (and rebuilds 
every time any relevant property is changed).

 

  was:
ExecuteScript with Jython engine has memory leak.
It uses JythonScriptEngineConfigurator class to configure jython execution 
environment.
The problem is in the line:
{code:java}
engine.eval("sys.path.append('" + modulePath + "')");{code}
There is no check if a module has already been added previously.
As a result, with each execution (onTrigger), string value of module property 
is being appended, and never reset.

Although InvokeScriptedProcessor uses the same engine configurator, memory leak 
is not reproducable in it,
because ISP builds the engine and compile the code only once (and rebuilds 
every time any relevant property is changed).

 


> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with the Jython engine has a memory leak.
>  It uses JythonScriptEngineConfigurator class to configure jython execution 
> environment.
>  The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check if a module has already been added previously.
>  As a result, with each execution (onTrigger), string value of module 
> property is being appended, and never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory 
> leak is not reproducible in it,
> because ISP builds the engine and compiles the code only once (and rebuilds 
> every time any relevant property is changed).
>  





[jira] [Updated] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-30 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5770:

Attachment: jython_modules.zip

> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with Jython engine has memory leak.
> It uses JythonScriptEngineConfigurator class to configure jython execution 
> environment.
> The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check if a module has already been added previously.
> As a result, with each execution (onTrigger) the module path is appended 
> again, and the list is never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory 
> leak is not reproducible there, because ISP builds the engine and compiles the 
> code only once (and rebuilds only when a relevant property changes).
>  





[jira] [Updated] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-30 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5770:

Attachment: ExecuteScriptMemLeak.xml

> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with Jython engine has memory leak.
> It uses JythonScriptEngineConfigurator class to configure jython execution 
> environment.
> The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check if a module has already been added previously.
> As a result, with each execution (onTrigger) the module path is appended 
> again, and the list is never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory 
> leak is not reproducible there, because ISP builds the engine and compiles the 
> code only once (and rebuilds only when a relevant property changes).
>  





[jira] [Created] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-30 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5770:
---

 Summary: Memory Leak in ExecuteScript
 Key: NIFI-5770
 URL: https://issues.apache.org/jira/browse/NIFI-5770
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Affects Versions: 1.8.0
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


ExecuteScript with the Jython engine has a memory leak.
It uses the JythonScriptEngineConfigurator class to configure the Jython 
execution environment.
The problem is in this line:
{code:java}
engine.eval("sys.path.append('" + modulePath + "')");{code}
There is no check whether the module path has already been added.
As a result, with each execution (onTrigger) the module path is appended again, 
and the list is never reset.

Although InvokeScriptedProcessor uses the same engine configurator, the memory 
leak is not reproducible there, because ISP builds the engine and compiles the 
code only once (and rebuilds only when a relevant property changes).

 





[jira] [Created] (NIFI-5728) Inconsistent behavior in XMLRecordSetWriter for Root Record Tag

2018-10-19 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5728:
---

 Summary: Inconsistent behavior in XMLRecordSetWriter for Root 
Record Tag
 Key: NIFI-5728
 URL: https://issues.apache.org/jira/browse/NIFI-5728
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.7.1
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


In XMLRecordSetWriter:
When used with "Use Schema Text" and "Name of Record Tag" is empty (so the 
record name should be used as the wrapping XML tag), it works correctly.
When used with a Schema Registry and "Name of Record Tag" is empty, it doesn't 
write the record name but uses the schema name instead. I believe this 
inconsistency comes from the fact that a schema defined by "Use Schema Text" 
doesn't have a name, so the name is taken from the record. But when the schema 
comes from a registry, the writer simply uses the schema identifier:
{code:java}
recordSchema.getIdentifier().getName();
{code}
IMO the root record name should be used in this case instead of the schema name.
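The precedence I am suggesting can be sketched with a hypothetical helper (illustration only, not the actual writer code; all names are made up):

```python
def resolve_root_tag(record_name, schema_name, configured_tag=None):
    # Precedence sketch: an explicit "Name of Record Tag" property wins,
    # then the record's own name, and only then the schema identifier name.
    if configured_tag:
        return configured_tag
    return record_name if record_name else schema_name
```

With this rule, both the "Use Schema Text" path and the registry path would produce the same tag whenever the record carries a name.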





[jira] [Comment Edited] (NIFI-4805) allow delayed transfer

2018-10-17 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653790#comment-16653790
 ] 

Ed Berezitsky edited comment on NIFI-4805 at 10/17/18 4:29 PM:
---

[~patricker],

if you go with the two-processor solution, then I would suggest removing 
"retryAttrName" from the PenalizeFlowFile processor implementation, because it 
does not actually implement retry; keeping it in the code is misleading.


was (Author: bdesert):
[~patricker],

if the you go with two processors solution - then I would suggest to remove 
"retryAttrName" from PenalizeFlowFile Processor implementation. Because you 
indeed do not implement retry. So that was misleading in the code.

> allow delayed transfer
> --
>
> Key: NIFI-4805
> URL: https://issues.apache.org/jira/browse/NIFI-4805
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Martin Mucha
>Assignee: Peter Wicks
>Priority: Minor
> Attachments: retry.xml
>
>
> Nifi has concept of penalization, but this penalization has fixed delay, and 
> there isn't way how to change it dynamically. 
> If we want to implement retry flow, where FlowFile flows in loop, we can 
> either lower performance of Processor via yielding it, or we can do active 
> waiting. And this is actually recommended as a correct way how to do that.
> It seems, that we can easily implement better RetryProcessor, all we missing 
> is `session.penalize` which accepts `penalizationPeriod`. Processor then can 
> gradually prolong waiting time after each failure.
>  
> Would it be possible to make such method visible?





[jira] [Commented] (NIFI-4805) allow delayed transfer

2018-10-17 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653790#comment-16653790
 ] 

Ed Berezitsky commented on NIFI-4805:
-

[~patricker],

if you go with the two-processor solution, then I would suggest removing 
"retryAttrName" from the PenalizeFlowFile processor implementation, because it 
does not actually implement retry; keeping it in the code is misleading.

> allow delayed transfer
> --
>
> Key: NIFI-4805
> URL: https://issues.apache.org/jira/browse/NIFI-4805
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Martin Mucha
>Assignee: Peter Wicks
>Priority: Minor
> Attachments: retry.xml
>
>
> Nifi has concept of penalization, but this penalization has fixed delay, and 
> there isn't way how to change it dynamically. 
> If we want to implement retry flow, where FlowFile flows in loop, we can 
> either lower performance of Processor via yielding it, or we can do active 
> waiting. And this is actually recommended as a correct way how to do that.
> It seems, that we can easily implement better RetryProcessor, all we missing 
> is `session.penalize` which accepts `penalizationPeriod`. Processor then can 
> gradually prolong waiting time after each failure.
>  
> Would it be possible to make such method visible?





[jira] [Comment Edited] (NIFI-4805) allow delayed transfer

2018-10-17 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653454#comment-16653454
 ] 

Ed Berezitsky edited comment on NIFI-4805 at 10/17/18 12:32 PM:


Some time ago I implemented such functionality using a scripted processor.

I designed it as follows:
 * attribute for a counter with a unique ID (similar to Mark's suggestion)
 * max number of retries (null/empty for infinite)
 * input property for the sequence of penalty durations. For instance, "1, 60, 
3600" means: the first retry waits 1 sec, the second 1 min, the third 1 hour 
(the last value is reused for each additional retry up to the max). I didn't 
want to make it infinitely exponential, because at some point it no longer 
makes sense to keep increasing the waiting time.

My relationships were defined as follows:
 * retry - continue (aka success) after penalization
 * expired - continue after the max retry count is reached (this also reset the 
counter to 0 to avoid wrap-around loops)

This implementation also gave me a simple "PenalizeFlowFile" functionality by 
not looping, or by setting the max count to 0.

My goals were to minimize the number of processors needed for a retry flow and 
to keep it as simple as possible.
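The capped penalty sequence described above amounts to a simple lookup; a Python sketch for illustration (function and parameter names are hypothetical):

```python
def penalty_seconds(delays, attempt):
    # delays is the configured sequence, e.g. [1, 60, 3600]; attempt is 0-based.
    # Attempts past the end of the list keep reusing the last delay, so the
    # wait time is bounded instead of growing exponentially forever.
    if not delays:
        return 0
    return delays[min(attempt, len(delays) - 1)]
```

A processor could combine this with the retry-counter attribute to pick the penalization period for each failed attempt.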

 

[~alopresto], [~alfonz], [~markap14], [~patricker],

what do you think about this design?

(tagged [~patricker] as he implemented initial version, which I think needs to 
be enhanced to include at least features that have been discussed above by 
Andy, Martin and Mark).

 


was (Author: bdesert):
Some time ago I have implemented such functionality using scripted processor.

I've designed it as following:
 * attribute for counter with unique ID (similar to Mark's suggestion)
 * max number of retries (null/empty for infinite)
 * input property for sequence of times to penalize. For instance, "1,60, 3600" 
- means: first time wait 1 sec, seconds time - 1 min, third time - 1 hour (and 
same for each additional retry till max). I didn't want to make it exponential 
infinite because at some point you don't want to gradually increase the waiting 
time, it just doesn't make sense.

My relationships were defined as following:
 * retry - continue (aka success) after penalization
 * expired - continue after max retry count reached (also was dropping counter 
to 0 to avoid wrapping.

This implementation also gave me a simple "PenalizeFlowFile" functionality by 
simply not looping, or specifying max count to 0

My reasons were to minimize number of processors for a retry flow, and make it 
simple as much as possible.

 

[~alopresto], [~alfonz], [~markap14], [~patricker],

what do you think about this design?

(tagged [~patricker] as he implemented initial version, which I think needs to 
be enhanced to include at least features that have been discussed above by 
Andy, Martin and Mark).

 

> allow delayed transfer
> --
>
> Key: NIFI-4805
> URL: https://issues.apache.org/jira/browse/NIFI-4805
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Martin Mucha
>Assignee: Peter Wicks
>Priority: Minor
> Attachments: retry.xml
>
>
> Nifi has concept of penalization, but this penalization has fixed delay, and 
> there isn't way how to change it dynamically. 
> If we want to implement retry flow, where FlowFile flows in loop, we can 
> either lower performance of Processor via yielding it, or we can do active 
> waiting. And this is actually recommended as a correct way how to do that.
> It seems, that we can easily implement better RetryProcessor, all we missing 
> is `session.penalize` which accepts `penalizationPeriod`. Processor then can 
> gradually prolong waiting time after each failure.
>  
> Would it be possible to make such method visible?





[jira] [Commented] (NIFI-4805) allow delayed transfer

2018-10-17 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653454#comment-16653454
 ] 

Ed Berezitsky commented on NIFI-4805:
-

Some time ago I implemented such functionality using a scripted processor.

I designed it as follows:
 * attribute for a counter with a unique ID (similar to Mark's suggestion)
 * max number of retries (null/empty for infinite)
 * input property for the sequence of penalty durations. For instance, "1, 60, 
3600" means: the first retry waits 1 sec, the second 1 min, the third 1 hour 
(the last value is reused for each additional retry up to the max). I didn't 
want to make it infinitely exponential, because at some point it no longer 
makes sense to keep increasing the waiting time.

My relationships were defined as follows:
 * retry - continue (aka success) after penalization
 * expired - continue after the max retry count is reached (this also reset the 
counter to 0 to avoid wrap-around loops)

This implementation also gave me a simple "PenalizeFlowFile" functionality by 
not looping, or by setting the max count to 0.

My goals were to minimize the number of processors needed for a retry flow and 
to keep it as simple as possible.

 

[~alopresto], [~alfonz], [~markap14], [~patricker],

what do you think about this design?

(tagged [~patricker] as he implemented initial version, which I think needs to 
be enhanced to include at least features that have been discussed above by 
Andy, Martin and Mark).

 

> allow delayed transfer
> --
>
> Key: NIFI-4805
> URL: https://issues.apache.org/jira/browse/NIFI-4805
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Martin Mucha
>Assignee: Peter Wicks
>Priority: Minor
> Attachments: retry.xml
>
>
> Nifi has concept of penalization, but this penalization has fixed delay, and 
> there isn't way how to change it dynamically. 
> If we want to implement retry flow, where FlowFile flows in loop, we can 
> either lower performance of Processor via yielding it, or we can do active 
> waiting. And this is actually recommended as a correct way how to do that.
> It seems, that we can easily implement better RetryProcessor, all we missing 
> is `session.penalize` which accepts `penalizationPeriod`. Processor then can 
> gradually prolong waiting time after each failure.
>  
> Would it be possible to make such method visible?





[jira] [Commented] (NIFI-5706) Processor ConvertAvroToParquet

2018-10-17 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653417#comment-16653417
 ] 

Ed Berezitsky commented on NIFI-5706:
-

+1 for Parquet Record Writer

Pros: applicable to any input format for which a record reader (or scripted reader) is available.

> Processor ConvertAvroToParquet 
> ---
>
> Key: NIFI-5706
> URL: https://issues.apache.org/jira/browse/NIFI-5706
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Affects Versions: 1.7.1
>Reporter: Mohit
>Priority: Major
>  Labels: pull-request-available
>
> *Why*?
> PutParquet support is limited to HDFS. 
> PutParquet bypasses the _flowfile_ implementation and writes the file 
> directly to sink. 
> We need a processor for parquet that works like _ConvertAvroToOrc_.
> *What*?
> _ConvertAvroToParquet_ will convert the incoming avro flowfile to a parquet 
> flowfile. Unlike PutParquet, which writes to the hdfs file system, processor 
> ConvertAvroToParquet would write into the flowfile, which can be pipelined to 
> put into other sinks, like _local_, _S3, Azure data lake_ etc.
>  





[jira] [Updated] (NIFI-5492) UDF in Expression Language

2018-10-15 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5492:

Issue Type: New Feature  (was: Wish)

> UDF in Expression Language
> --
>
> Key: NIFI-5492
> URL: https://issues.apache.org/jira/browse/NIFI-5492
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0, 1.7.0, 1.7.1
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, patch
> Attachments: 0001-NIFI-5492_EXEC-Adding-UDF-to-EL.patch
>
>
> Set of functions available to use in expression language is limited by 
> predefined ones.
> This request is to provide an ability to plug in custom/user defined 
> functions.
> For example:
> ${*exec*('com.example.MyUDF', 'param1', 'param2')}
> Should be able to support:
>  # Multiple, not limited number of parameters (including zero params)
>  # Param data types should  support all EL data types (dates, whole numbers, 
> decimals, strings, booleans)
>  





[jira] [Updated] (NIFI-5492) UDF in Expression Language

2018-09-18 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5492:

   Labels: features patch ready-to-commit  (was: features)
Affects Version/s: 1.5.0
   1.6.0
   1.7.0
   1.7.1
   Attachment: 0001-NIFI-5492_EXEC-Adding-UDF-to-EL.patch
   Status: Patch Available  (was: In Progress)

> UDF in Expression Language
> --
>
> Key: NIFI-5492
> URL: https://issues.apache.org/jira/browse/NIFI-5492
> Project: Apache NiFi
>  Issue Type: Wish
>  Components: Core Framework
>Affects Versions: 1.7.1, 1.7.0, 1.6.0, 1.5.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, patch, ready-to-commit
> Attachments: 0001-NIFI-5492_EXEC-Adding-UDF-to-EL.patch
>
>
> Set of functions available to use in expression language is limited by 
> predefined ones.
> This request is to provide an ability to plug in custom/user defined 
> functions.
> For example:
> ${*exec*('com.example.MyUDF', 'param1', 'param2')}
> Should be able to support:
>  # Multiple, not limited number of parameters (including zero params)
>  # Param data types should  support all EL data types (dates, whole numbers, 
> decimals, strings, booleans)
>  





[jira] [Updated] (NIFI-5492) UDF in Expression Language

2018-09-18 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5492:

Labels: features patch  (was: features patch ready-to-commit)

> UDF in Expression Language
> --
>
> Key: NIFI-5492
> URL: https://issues.apache.org/jira/browse/NIFI-5492
> Project: Apache NiFi
>  Issue Type: Wish
>  Components: Core Framework
>Affects Versions: 1.5.0, 1.6.0, 1.7.0, 1.7.1
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, patch
> Attachments: 0001-NIFI-5492_EXEC-Adding-UDF-to-EL.patch
>
>
> Set of functions available to use in expression language is limited by 
> predefined ones.
> This request is to provide an ability to plug in custom/user defined 
> functions.
> For example:
> ${*exec*('com.example.MyUDF', 'param1', 'param2')}
> Should be able to support:
>  # Multiple, not limited number of parameters (including zero params)
>  # Param data types should  support all EL data types (dates, whole numbers, 
> decimals, strings, booleans)
>  





[jira] [Updated] (NIFI-5492) UDF in Expression Language

2018-08-06 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5492:

Description: 
Set of functions available to use in expression language is limited by 
predefined ones.

This request is to provide an ability to plug in custom/user defined functions.

For example:

${*exec*('com.example.MyUDF', 'param1', 'param2')}

Should be able to support:
 # Multiple, not limited number of parameters (including zero params)
 # Param data types should  support all EL data types (dates, whole numbers, 
decimals, strings, booleans)

 

  was:
Set of functions available to use in expression language is limited by 
predefined ones.

This request is to provide an ability to plug in custom/user defined functions.

For example:

${{color:#FF}*exec*{color}('com.example.MyUDF', 'param1', 'param2')}

Should be able to support:
 # Multiple, not limited number of parameters (including zero params)
 # Param data types should  support all EL data types (dates, whole numbers, 
decimals, strings, booleans)

 


> UDF in Expression Language
> --
>
> Key: NIFI-5492
> URL: https://issues.apache.org/jira/browse/NIFI-5492
> Project: Apache NiFi
>  Issue Type: Wish
>  Components: Core Framework
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features
>
> Set of functions available to use in expression language is limited by 
> predefined ones.
> This request is to provide an ability to plug in custom/user defined 
> functions.
> For example:
> ${*exec*('com.example.MyUDF', 'param1', 'param2')}
> Should be able to support:
>  # Multiple, not limited number of parameters (including zero params)
>  # Param data types should  support all EL data types (dates, whole numbers, 
> decimals, strings, booleans)
>  





[jira] [Created] (NIFI-5492) UDF in Expression Language

2018-08-06 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-5492:
---

 Summary: UDF in Expression Language
 Key: NIFI-5492
 URL: https://issues.apache.org/jira/browse/NIFI-5492
 Project: Apache NiFi
  Issue Type: Wish
  Components: Core Framework
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


Set of functions available to use in expression language is limited by 
predefined ones.

This request is to provide an ability to plug in custom/user defined functions.

For example:

${{color:#FF}*exec*{color}('com.example.MyUDF', 'param1', 'param2')}

Should be able to support:
 # Multiple, not limited number of parameters (including zero params)
 # Param data types should  support all EL data types (dates, whole numbers, 
decimals, strings, booleans)
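The proposed exec() hook amounts to resolving a function by its fully qualified name and forwarding an arbitrary parameter list. A Python sketch of that dispatch for illustration (the actual NiFi implementation would use Java reflection and class loading; 'exec_udf' is a hypothetical name):

```python
import importlib

def exec_udf(qualified_name, *params):
    # Resolve 'package.module.function' dynamically and invoke it with
    # any number of parameters, mirroring the ${exec(...)} proposal.
    module_name, func_name = qualified_name.rsplit('.', 1)
    func = getattr(importlib.import_module(module_name), func_name)
    return func(*params)
```

For example, exec_udf('math.hypot', 3.0, 4.0) would resolve the standard-library function and return 5.0, just as ${exec('com.example.MyUDF', 'param1', 'param2')} would resolve and invoke the user's class.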

 





[jira] [Commented] (NIFI-4407) Non-EL statement processed as expression language

2018-05-15 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476709#comment-16476709
 ] 

Ed Berezitsky commented on NIFI-4407:
-

[~markap14], [~pvillard],

It's not clear whether you guys agreed it is a bug or not :). IMO it's not a 
bug, but the issue is still open...

So an improvement can be made: we could add a property "Evaluate Text" 
(true/false) with a default of "true", so it stays backward compatible.

Thoughts?

> Non-EL statement processed as expression language
> -
>
> Key: NIFI-4407
> URL: https://issues.apache.org/jira/browse/NIFI-4407
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.1.0, 1.2.0, 1.1.1, 1.0.1, 1.3.0
>Reporter: Pierre Villard
>Priority: Critical
>
> If you take a GFF with custom text: {{test$$foo}}
> The generated text will be: {{test$foo}}
> The property supports expression language and one $ is removed during the EL 
> evaluation step. This can be an issue if a user wants to use a value 
> containing two consecutive $$ (such as in password fields).
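The {{test$$foo}} behavior described in the issue is the literal-escape step of EL evaluation, where a doubled '$$' collapses to a single '$'. A Python sketch of just that step (illustration only; NiFi's evaluator does this as part of full expression parsing):

```python
def unescape_dollars(text):
    # During EL evaluation a literal '$$' collapses to '$', which is why
    # GenerateFlowFile's custom text 'test$$foo' comes out as 'test$foo'.
    return text.replace('$$', '$')
```

A "Evaluate Text" flag, as proposed, would simply skip this step (and the rest of EL evaluation) when set to false.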





[jira] [Comment Edited] (NIFI-5044) SelectHiveQL accept only one statement

2018-05-13 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473374#comment-16473374
 ] 

Ed Berezitsky edited comment on NIFI-5044 at 5/14/18 12:51 AM:
---

[~disoardi], [~pvillard] and [~mattyb149],

PR created. Please take a look and let me know if everything is OK.


was (Author: bdesert):
[~disoardi], [~pvillard] and [~mattyb149],

PR create, take a look please, and let me know if everything is OK.

> SelectHiveQL accept only one statement
> --
>
> Key: NIFI-5044
> URL: https://issues.apache.org/jira/browse/NIFI-5044
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0, 1.3.0, 1.4.0, 1.5.0, 1.6.0
>Reporter: Davide Isoardi
>Assignee: Ed Berezitsky
>Priority: Critical
>  Labels: features, patch, pull-request-available
> Attachments: 
> 0001-NIFI-5044-SelectHiveQL-accept-only-one-statement.patch
>
>
> [This 
> commit|https://github.com/apache/nifi/commit/bbc714e73ba245de7bc32fd9958667c847101f7d] 
> claims to add support for running multiple statements in both 
> SelectHiveQL and PutHiveQL; instead, it only adds that support to PutHiveQL, 
> so SelectHiveQL still lacks this important feature. @Matt Burgess, I saw that 
> you worked on that, is there any reason for this? If not, can we support it?
> If I try to execute this query:
> {quote}set hive.vectorized.execution.enabled = false; SELECT * FROM table_name
> {quote}
> I have this error:
>  
> {quote}2018-04-05 13:35:40,572 ERROR [Timer-Driven Process Thread-146] 
> o.a.nifi.processors.hive.SelectHiveQL 
> SelectHiveQL[id=243d4c17-b1fe-14af--ee8ce15e] Unable to execute 
> HiveQL select query set hive.vectorized.execution.enabled = false; SELECT * 
> FROM table_name for 
> StandardFlowFileRecord[uuid=0e035558-07ce-473b-b0d4-ac00b8b1df93,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1522824912161-2753, 
> container=default, section=705], offset=838441, 
> length=25],offset=0,name=cliente_attributi.csv,size=25] due to 
> org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: 
> The query did not generate a result set!; routing to failure: {}
>  org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: 
> The query did not generate a result set!
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:305)
>  at 
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2529)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:275)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.lambda$onTrigger$0(SelectHiveQL.java:215)
>  at 
> org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
>  at 
> org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:106)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:215)
>  at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
>  at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>  at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>  at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.sql.SQLException: The query did not generate a result set!
>  at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:438)
>  at 
> org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at 
> org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:293)
> {quote}
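If SelectHiveQL were to support scripts like the failing one above, one straightforward approach is to split the script into statements, execute them in order, and return the result set of the final query. A naive Python sketch of the splitting step (an assumption for illustration; it ignores ';' inside string literals, which a real implementation must handle):

```python
def split_statements(script):
    # Split a multi-statement script on ';' and drop empty fragments.
    # A production version would need to respect quoted literals and comments.
    return [s.strip() for s in script.split(';') if s.strip()]
```

Applied to the example, this yields the "set hive.vectorized.execution.enabled = false" statement followed by the SELECT, so only the last statement is expected to produce a result set.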





[jira] [Updated] (NIFI-5044) SelectHiveQL accept only one statement

2018-05-12 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5044:

   Labels: features patch pull-request-available  (was: )
Affects Version/s: 1.3.0
   1.4.0
   1.5.0
   1.6.0
   Attachment: 
0001-NIFI-5044-SelectHiveQL-accept-only-one-statement.patch
   Status: Patch Available  (was: In Progress)

> SelectHiveQL accept only one statement
> --
>
> Key: NIFI-5044
> URL: https://issues.apache.org/jira/browse/NIFI-5044
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.6.0, 1.5.0, 1.4.0, 1.3.0, 1.2.0
>Reporter: Davide Isoardi
>Assignee: Ed Berezitsky
>Priority: Critical
>  Labels: patch, pull-request-available, features
> Attachments: 
> 0001-NIFI-5044-SelectHiveQL-accept-only-one-statement.patch
>
>
> [This 
> commit|https://github.com/apache/nifi/commit/bbc714e73ba245de7bc32fd9958667c847101f7d] 
> claims to add support for running multiple statements in both 
> SelectHiveQL and PutHiveQL; instead, it only adds that support to PutHiveQL, 
> so SelectHiveQL still lacks this important feature. @Matt Burgess, I saw that 
> you worked on that, is there any reason for this? If not, can we support it?
> If I try to execute this query:
> {quote}set hive.vectorized.execution.enabled = false; SELECT * FROM table_name
> {quote}
> I have this error:
>  
> {quote}2018-04-05 13:35:40,572 ERROR [Timer-Driven Process Thread-146] 
> o.a.nifi.processors.hive.SelectHiveQL 
> SelectHiveQL[id=243d4c17-b1fe-14af--ee8ce15e] Unable to execute 
> HiveQL select query set hive.vectorized.execution.enabled = false; SELECT * 
> FROM table_name for 
> StandardFlowFileRecord[uuid=0e035558-07ce-473b-b0d4-ac00b8b1df93,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1522824912161-2753, 
> container=default, section=705], offset=838441, 
> length=25],offset=0,name=cliente_attributi.csv,size=25] due to 
> org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: 
> The query did not generate a result set!; routing to failure: {}
>  org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: 
> The query did not generate a result set!
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:305)
>  at 
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2529)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:275)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.lambda$onTrigger$0(SelectHiveQL.java:215)
>  at 
> org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
>  at 
> org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:106)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:215)
>  at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
>  at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>  at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>  at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.sql.SQLException: The query did not generate a result set!
>  at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:438)
>  at 
> org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at 
> org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:293)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5044) SelectHiveQL accept only one statement

2018-05-12 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473374#comment-16473374
 ] 

Ed Berezitsky commented on NIFI-5044:
-

[~disoardi], [~pvillard] and [~mattyb149],

PR created. Please take a look and let me know if everything is OK.

> SelectHiveQL accept only one statement
> --
>
> Key: NIFI-5044
> URL: https://issues.apache.org/jira/browse/NIFI-5044
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Davide Isoardi
>Assignee: Ed Berezitsky
>Priority: Critical
>
> [This commit|https://github.com/apache/nifi/commit/bbc714e73ba245de7bc32fd9958667c847101f7d] 
> claims to add support for running multiple statements in both SelectHiveQL 
> and PutHiveQL; instead, it adds support only to PutHiveQL, so SelectHiveQL 
> still lacks this important feature. @Matt Burgess, I saw that you worked on 
> that; is there any reason for this? If not, can we support it?
> If I try to execute this query:
> {quote}set hive.vectorized.execution.enabled = false; SELECT * FROM table_name
> {quote}
> I have this error:
>  
> {quote}2018-04-05 13:35:40,572 ERROR [Timer-Driven Process Thread-146] 
> o.a.nifi.processors.hive.SelectHiveQL 
> SelectHiveQL[id=243d4c17-b1fe-14af--ee8ce15e] Unable to execute 
> HiveQL select query set hive.vectorized.execution.enabled = false; SELECT * 
> FROM table_name for 
> StandardFlowFileRecord[uuid=0e035558-07ce-473b-b0d4-ac00b8b1df93,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1522824912161-2753, 
> container=default, section=705], offset=838441, 
> length=25],offset=0,name=cliente_attributi.csv,size=25] due to 
> org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: 
> The query did not generate a result set!; routing to failure: {}
>  org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: 
> The query did not generate a result set!
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:305)
>  at 
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2529)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:275)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.lambda$onTrigger$0(SelectHiveQL.java:215)
>  at 
> org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
>  at 
> org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:106)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:215)
>  at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
>  at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>  at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>  at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.sql.SQLException: The query did not generate a result set!
>  at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:438)
>  at 
> org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at 
> org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:293)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (NIFI-5141) ValidateRecord considers a record invalid if it has an integer value and schema says double, even if strict type checking is disabled

2018-05-07 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466738#comment-16466738
 ] 

Ed Berezitsky edited comment on NIFI-5141 at 5/8/18 3:45 AM:
-

[~markap14],

this bug affected two methods: _DataTypeUtils.isFloatTypeCompatible_ and 
_DataTypeUtils.isDoubleTypeCompatible_.

Reproduced both successfully. After applying the patch, both methods return the 
desired results for numbers with digits after the period (e.g. "13.45") and for 
integer/long-looking numbers (e.g. "13"). But they still don't return "true" for 
numbers that have a period with no digits after it, e.g. "13.", even though 
Double.parseDouble("13.") parses such values correctly. I think the regex for 
doubles/floats should support that format as well.

I would recommend adding one more pattern to support it:

 
{code:java}
    private static final String  doubleRegex =
        OptionalSign +
        "(" +
            Infinity + "|" +
            NotANumber + "|"+
            "(" + Base10Digits + OptionalBase10Decimal + ")" + "|" +
            "(" + Base10Digits + "\\." + ")" + "|" +   // recommend to add this 
pattern
            "(" + Base10Digits + OptionalBase10Decimal + Base10Exponent + ")" + 
"|" +
            "(" + Base10Decimal + OptionalBase10Exponent + ")" +
        ")";
 
{code}
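To illustrate the gap, here is a minimal, self-contained sketch. The pattern fragments below are simplified stand-ins for the constants referenced above, not the actual DataTypeUtils definitions:

```java
import java.util.regex.Pattern;

public class DoubleRegexCheck {
    // Simplified stand-ins for the fragments used above (assumed, not NiFi's exact strings)
    public static final String OPTIONAL_SIGN = "[+-]?";
    public static final String BASE10_DIGITS = "\\d+";
    public static final String BASE10_DECIMAL = "\\." + BASE10_DIGITS;

    // Without the recommended alternative: "13." has no full match
    public static final Pattern WITHOUT = Pattern.compile(
            OPTIONAL_SIGN + "(" + BASE10_DIGITS + "(" + BASE10_DECIMAL + ")?"
                    + "|" + BASE10_DECIMAL + ")");

    // With the recommended "digits followed by a bare period" alternative
    public static final Pattern WITH = Pattern.compile(
            OPTIONAL_SIGN + "(" + BASE10_DIGITS + "(" + BASE10_DECIMAL + ")?"
                    + "|" + BASE10_DIGITS + "\\."
                    + "|" + BASE10_DECIMAL + ")");

    public static void main(String[] args) {
        for (String s : new String[] {"13", "13.", "13.45", ".13", "."}) {
            System.out.println(s + " -> without: " + WITHOUT.matcher(s).matches()
                    + ", with: " + WITH.matcher(s).matches());
        }
    }
}
```

Note that Double.parseDouble("13.") returns 13.0, so the extra alternative brings the regex in line with what the JDK parser already accepts for this case.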
Before:

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13")); --> 
{color:#FF0000}false{color}

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13")); --> 
{color:#FF0000}false{color}

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13.")); --> 
{color:#FF0000}false{color}

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13.")); --> 
{color:#FF0000}false{color}

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13.0")); --> true

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13.0")); --> true

        System.out.println(DataTypeUtils.isFloatTypeCompatible(".13")); --> true

        System.out.println(DataTypeUtils.isDoubleTypeCompatible(".13")); --> true

        System.out.println(DataTypeUtils.isFloatTypeCompatible(".")); --> false

        System.out.println(DataTypeUtils.isDoubleTypeCompatible(".")); --> false

After:

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13")); --> 
{color:#14892c}true{color}

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13")); --> 
{color:#14892c}true{color}

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13.")); --> 
{color:#d04437}false — recommended update fixes this{color}

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13.")); --> 
{color:#d04437}false — recommended update fixes this{color}

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13.0")); --> 
true

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13.0")); --> 
true

        System.out.println(DataTypeUtils.isFloatTypeCompatible(".13")); --> true

        System.out.println(DataTypeUtils.isDoubleTypeCompatible(".13")); --> 
true

        System.out.println(DataTypeUtils.isFloatTypeCompatible(".")); --> false

        System.out.println(DataTypeUtils.isDoubleTypeCompatible(".")); --> false



[jira] [Commented] (NIFI-5141) ValidateRecord considers a record invalid if it has an integer value and schema says double, even if strict type checking is disabled

2018-05-07 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466738#comment-16466738
 ] 

Ed Berezitsky commented on NIFI-5141:
-

[~markap14],

this bug affected two methods: _DataTypeUtils.isFloatTypeCompatible_ and 
_DataTypeUtils.isDoubleTypeCompatible_.

Reproduced both successfully. After applying the patch, both methods return the 
desired results for numbers with digits after the period (e.g. "13.45"). But 
they still don't return "true" for numbers that have a period with no digits 
after it, e.g. "13.", even though Double.parseDouble("13.") parses such values 
correctly. I think the regex for doubles/floats should support that format as 
well.

I would recommend adding one more pattern to support it:

 
{code:java}
    private static final String  doubleRegex =
        OptionalSign +
        "(" +
            Infinity + "|" +
            NotANumber + "|"+
            "(" + Base10Digits + OptionalBase10Decimal + ")" + "|" +
            "(" + Base10Digits + "\\." + ")" + "|" +   // recommend to add this 
pattern
            "(" + Base10Digits + OptionalBase10Decimal + Base10Exponent + ")" + 
"|" +
            "(" + Base10Decimal + OptionalBase10Exponent + ")" +
        ")";
 
{code}
Before:

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13")); --> 
{color:#FF0000}false{color}

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13")); --> 
{color:#FF0000}false{color}

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13.")); --> 
{color:#FF0000}false{color}

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13.")); --> 
{color:#FF0000}false{color}

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13.0")); --> true

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13.0")); --> true

        System.out.println(DataTypeUtils.isFloatTypeCompatible(".13")); --> true

        System.out.println(DataTypeUtils.isDoubleTypeCompatible(".13")); --> true

        System.out.println(DataTypeUtils.isFloatTypeCompatible(".")); --> false

        System.out.println(DataTypeUtils.isDoubleTypeCompatible(".")); --> false

After:

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13")); --> 
{color:#14892c}true{color}

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13")); --> 
{color:#14892c}true{color}

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13.")); --> 
{color:#d04437}false — recommended update fixes this{color}

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13.")); --> 
{color:#d04437}false — recommended update fixes this{color}

        System.out.println(DataTypeUtils.isFloatTypeCompatible("13.0")); --> 
true

        System.out.println(DataTypeUtils.isDoubleTypeCompatible("13.0")); --> 
true

        System.out.println(DataTypeUtils.isFloatTypeCompatible(".13")); --> true

        System.out.println(DataTypeUtils.isDoubleTypeCompatible(".13")); --> 
true

        System.out.println(DataTypeUtils.isFloatTypeCompatible(".")); --> false

        System.out.println(DataTypeUtils.isDoubleTypeCompatible(".")); --> false

> ValidateRecord considers a record invalid if it has an integer value and 
> schema says double, even if strict type checking is disabled
> -
>
> Key: NIFI-5141
> URL: https://issues.apache.org/jira/browse/NIFI-5141
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
>  Labels: Record, beginner, newbie, validation
> Fix For: 1.7.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4906) Add GetHdfsFileInfo Processor

2018-05-07 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-4906:

Labels: patch pull-request-available  (was: )
Attachment: gethdfsfileinfo.patch
Status: Patch Available  (was: In Progress)

> Add GetHdfsFileInfo Processor
> -
>
> Key: NIFI-4906
> URL: https://issues.apache.org/jira/browse/NIFI-4906
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: patch, pull-request-available
> Attachments: NiFi-GetHDFSFileInfo.pdf, gethdfsfileinfo.patch
>
>
> Add *GetHdfsFileInfo* Processor to be able to get stats from a file system.
> This processor should support recursive scan, getting information of 
> directories and files.
> _File-level info required_: name, path, length, modified timestamp, last 
> access timestamp, owner, group, permissions.
> _Directory-level info required_: name, path, sum of lengths of files under a 
> dir, count of files under a dir, modified timestamp, last access timestamp, 
> owner, group, permissions.
>  
> The result returned:
>  * in a single flow file (a JSON line per file/dir info in the content);
>  * a flow file per file/dir info (in the content as a JSON object, or in a 
> set of attributes, by choice).
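As an illustration only, a single per-entry JSON line matching the fields listed above might look like the following (key names and values are hypothetical; the issue text specifies only which fields are required, not their exact names):

```json
{
  "path": "/data/landing",
  "name": "cliente_attributi.csv",
  "length": 25,
  "modifiedTime": 1522824912161,
  "lastAccessTime": 1522824912161,
  "owner": "nifi",
  "group": "hadoop",
  "permissions": "rw-r--r--"
}
```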



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5044) SelectHiveQL accept only one statement

2018-05-03 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463091#comment-16463091
 ] 

Ed Berezitsky commented on NIFI-5044:
-

Looks like we are all aligned now. Thanks for your input!

> SelectHiveQL accept only one statement
> --
>
> Key: NIFI-5044
> URL: https://issues.apache.org/jira/browse/NIFI-5044
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Davide Isoardi
>Assignee: Ed Berezitsky
>Priority: Critical
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5044) SelectHiveQL accept only one statement

2018-05-02 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461241#comment-16461241
 ] 

Ed Berezitsky commented on NIFI-5044:
-

[~mattyb149],

I think we should do that regardless of whether the attributes contain EL or not.

But there are more scenarios. This processor allows optional input, so it can 
start the flow, in which case there won't be incoming flow files. In that case 
there is nothing we can do except post the error to the bulletin.

Now, the processor creates one OR MORE flow files (depending on the number of 
records and the "Max Rows Per Flow File" parameter). Flow files are cached 
until all data is collected; only after that are the new flow files routed to 
the success relationship. If we fail on the post-queries, we either need to 
route them all to failure, or the entire data set will be discarded (if we 
roll back). I can iterate over all the flow files with data and add an 
attribute with the error cause to each of them before sending them all to 
failure.

Could you please comment?

Thanks.
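For context, the multi-statement support under discussion first has to split a script like the one from the issue description into individual statements. A naive sketch follows; this is not NiFi's actual implementation, and real code must also handle semicolons inside quotes and comments:

```java
import java.util.ArrayList;
import java.util.List;

public class HiveScriptSplitter {
    // Naive split on ";" -- a real implementation must ignore semicolons
    // that appear inside string literals or comments.
    public static List<String> splitStatements(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String sql = part.trim();
            if (!sql.isEmpty()) {
                statements.add(sql);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        // The failing example from the issue splits into two statements: a "set"
        // pre-query (to be run via execute()) and a final SELECT (executeQuery()).
        String script = "set hive.vectorized.execution.enabled = false; SELECT * FROM table_name";
        System.out.println(splitStatements(script));
    }
}
```

The SQLException in the issue ("The query did not generate a result set!") comes from passing the whole script to executeQuery(), which is why the leading "set" statements need to be executed separately.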

> SelectHiveQL accept only one statement
> --
>
> Key: NIFI-5044
> URL: https://issues.apache.org/jira/browse/NIFI-5044
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Davide Isoardi
>Assignee: Ed Berezitsky
>Priority: Critical
>

[jira] [Assigned] (NIFI-5044) SelectHiveQL accept only one statement

2018-04-26 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky reassigned NIFI-5044:
---

Assignee: Ed Berezitsky

> SelectHiveQL accept only one statement
> --
>
> Key: NIFI-5044
> URL: https://issues.apache.org/jira/browse/NIFI-5044
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Davide Isoardi
>Assignee: Ed Berezitsky
>Priority: Critical
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5044) SelectHiveQL accept only one statement

2018-04-24 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451516#comment-16451516
 ] 

Ed Berezitsky commented on NIFI-5044:
-

There are some assumptions that need to be made, or more properties to be added.

Let's assume a pre-query fails due to incorrect syntax or errors (like "_Error: 
Error while processing statement: Cannot modify var123 at runtime. It is not in 
the list of params that are allowed to be modified at runtime 
(state=42000,code=1)_"). Should we route to failure, or should we just create 
an attribute with the list of errors for pre- and post-queries? Or we could 
define another property for the user to decide ("On Pre/Post Error": 
ignore/failure).

[~pvillard], [~mattyb149], [~disoardi], your thoughts? 

Also, I see it's still unassigned. Is anybody working on it? If not, I can take 
it.
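A minimal sketch of the proposed "On pre/post Error" choice. The property values and routing logic here are assumptions drawn from this discussion, not actual NiFi code:

```java
public class PrePostErrorPolicy {
    // Hypothetical values for the proposed "On pre/post Error" property
    public enum OnError { IGNORE, FAILURE }

    // Decide where a flow file goes when a pre/post query fails.
    // IGNORE keeps routing to success (the error would instead be recorded
    // as a flow file attribute); FAILURE routes the flow file to failure.
    public static String route(OnError policy, boolean prePostQueryFailed) {
        if (prePostQueryFailed && policy == OnError.FAILURE) {
            return "failure";
        }
        return "success";
    }

    public static void main(String[] args) {
        System.out.println(route(OnError.IGNORE, true));
        System.out.println(route(OnError.FAILURE, true));
    }
}
```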

> SelectHiveQL accept only one statement
> --
>
> Key: NIFI-5044
> URL: https://issues.apache.org/jira/browse/NIFI-5044
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Davide Isoardi
>Priority: Critical
>
> In [this 
> |[https://github.com/apache/nifi/commit/bbc714e73ba245de7bc32fd9958667c847101f7d]
>  ] commit claims to add support to running multiple statements both on 
> SelectHiveQL and PutHiveQL; instead, it adds only the support to PutHiveQL, 
> so SelectHiveQL still lacks this important feature. @Matt Burgess, I saw that 
> you worked on that, is there any reason for this? If not, can we support it?
> If I try to execute this query:
> {quote}set hive.vectorized.execution.enabled = false; SELECT * FROM table_name
> {quote}
> I have this error:
>  
> {quote}2018-04-05 13:35:40,572 ERROR [Timer-Driven Process Thread-146] 
> o.a.nifi.processors.hive.SelectHiveQL 
> SelectHiveQL[id=243d4c17-b1fe-14af--ee8ce15e] Unable to execute 
> HiveQL select query set hive.vectorized.execution.enabled = false; SELECT * 
> FROM table_name for 
> StandardFlowFileRecord[uuid=0e035558-07ce-473b-b0d4-ac00b8b1df93,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1522824912161-2753, 
> container=default, section=705], offset=838441, 
> length=25],offset=0,name=cliente_attributi.csv,size=25] due to 
> org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: 
> The query did not generate a result set!; routing to failure: {}
>  org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: 
> The query did not generate a result set!
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:305)
>  at 
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2529)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:275)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.lambda$onTrigger$0(SelectHiveQL.java:215)
>  at 
> org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
>  at 
> org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:106)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:215)
>  at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
>  at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>  at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>  at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.sql.SQLException: The query did not generate a result set!
>  at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:438)
>  at 
> org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at 
> org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at 
> org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:293)
> {quote}
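One way the missing SelectHiveQL behavior could work is to split the incoming SQL on semicolons (ignoring semicolons inside quoted strings), run every statement but the last with executeUpdate(), and run only the final SELECT with executeQuery(). A minimal sketch of just the splitting step, with a hypothetical class and method name:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split a multi-statement HiveQL string on semicolons,
// so pre-statements (e.g. "set ...") could be run with executeUpdate() and
// only the final SELECT with executeQuery(). Semicolons inside single- or
// double-quoted strings are preserved.
public class HiveStatementSplitter {

    public static List<String> splitStatements(String sql) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inSingleQuote = false;
        boolean inDoubleQuote = false;
        for (char c : sql.toCharArray()) {
            if (c == '\'' && !inDoubleQuote) {
                inSingleQuote = !inSingleQuote;
            } else if (c == '"' && !inSingleQuote) {
                inDoubleQuote = !inDoubleQuote;
            }
            if (c == ';' && !inSingleQuote && !inDoubleQuote) {
                String stmt = current.toString().trim();
                if (!stmt.isEmpty()) {
                    statements.add(stmt);
                }
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        String last = current.toString().trim();
        if (!last.isEmpty()) {
            statements.add(last);
        }
        return statements;
    }

    public static void main(String[] args) {
        String sql = "set hive.vectorized.execution.enabled = false; SELECT * FROM table_name";
        for (String stmt : splitStatements(sql)) {
            System.out.println(stmt);
        }
    }
}
```

For the failing query above, this would yield the `set` statement and the `SELECT` as two separate statements, so only the latter needs to produce a result set.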



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4388) Module path not honored using InvokeScriptedProcessor

2018-03-23 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-4388:

Affects Version/s: 1.1.0
   1.5.0

> Module path not honored using InvokeScriptedProcessor
> -
>
> Key: NIFI-4388
> URL: https://issues.apache.org/jira/browse/NIFI-4388
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.1.0, 1.3.0, 1.5.0
>Reporter: Patrice Freydiere
>Assignee: Ed Berezitsky
>Priority: Major
>
> When specifying the Module Path parameter with the Groovy script engine, the 
> jars are not added to the classpath.
> This makes it impossible to use third-party libraries in scripting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4388) Module path not honored using InvokeScriptedProcessor

2018-03-23 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412024#comment-16412024
 ] 

Ed Berezitsky commented on NIFI-4388:
-

[~frett27],

The bug appears when you CHANGE the module property.

You can force the modules to be reloaded by performing these steps:
 # Change modules to the desired value
 # Change the engine type to another value
 # Apply changes
 # Configure the processor again, and change the engine type back to "groovy".

Modules will be loaded.

I'm going to fix this bug for the next version (1.7).
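For context, "honoring" a module path essentially means turning each configured path entry into a URL and exposing it through a URLClassLoader that the script engine can use to resolve third-party classes. A minimal illustration of that idea (not NiFi's actual implementation; the class and method names are hypothetical):

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Sketch only: convert a comma-separated module path into URLs and build a
// URLClassLoader over them, so a script engine could resolve third-party
// classes. This illustrates the concept, not NiFi's internal code.
public class ModulePathLoader {

    public static URL[] toUrls(String modulePath) throws Exception {
        List<URL> urls = new ArrayList<>();
        for (String entry : modulePath.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                urls.add(new File(trimmed).toURI().toURL());
            }
        }
        return urls.toArray(new URL[0]);
    }

    public static ClassLoader buildClassLoader(String modulePath) throws Exception {
        // Parent is the current classloader; module jars are searched through it.
        return new URLClassLoader(toUrls(modulePath), ModulePathLoader.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        for (URL u : toUrls("/opt/libs/foo.jar, /opt/libs/bar.jar")) {
            System.out.println(u);
        }
    }
}
```

The bug described here would correspond to the classloader not being rebuilt when the module property changes, which is why switching the engine type back and forth forces a reload.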

> Module path not honored using InvokeScriptedProcessor
> -
>
> Key: NIFI-4388
> URL: https://issues.apache.org/jira/browse/NIFI-4388
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.3.0
>Reporter: Patrice Freydiere
>Priority: Major
>
> When specifying the Module Path parameter with the Groovy script engine, the 
> jars are not added to the classpath.
> This makes it impossible to use third-party libraries in scripting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-4388) Module path not honored using InvokeScriptedProcessor

2018-03-23 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky reassigned NIFI-4388:
---

Assignee: Ed Berezitsky

> Module path not honored using InvokeScriptedProcessor
> -
>
> Key: NIFI-4388
> URL: https://issues.apache.org/jira/browse/NIFI-4388
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.3.0
>Reporter: Patrice Freydiere
>Assignee: Ed Berezitsky
>Priority: Major
>
> When specifying the Module Path parameter with the Groovy script engine, the 
> jars are not added to the classpath.
> This makes it impossible to use third-party libraries in scripting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4968) InvokeScriptedProcessor Crashing NiFi cluster

2018-03-13 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-4968:

Attachment: NIFI-4968-log-error-fixed.rtf
NIFI-4968-log-error-bug.gz
NIFI-4968-bulletin-fixed.rtf
NIFI-4968-bulletin-bug.rtf

> InvokeScriptedProcessor Crashing NiFi cluster
> -
>
> Key: NIFI-4968
> URL: https://issues.apache.org/jira/browse/NIFI-4968
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.1.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Attachments: NIFI-4968-bulletin-bug.rtf, 
> NIFI-4968-bulletin-fixed.rtf, NIFI-4968-log-error-bug.gz, 
> NIFI-4968-log-error-fixed.rtf
>
>
> InvokeScriptedProcessor with the Groovy engine crashes a cluster when the 
> Groovy script doesn't compile.
> It also prints errors to the log non-stop until the processor is deleted.
> The bug was found in NiFi 1.1, but is reproducible up to and including 1.5.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-4968) InvokeScriptedProcessor Crashing NiFi cluster

2018-03-12 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-4968:
---

 Summary: InvokeScriptedProcessor Crashing NiFi cluster
 Key: NIFI-4968
 URL: https://issues.apache.org/jira/browse/NIFI-4968
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.1.0
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


InvokeScriptedProcessor with the Groovy engine crashes a cluster when the Groovy 
script doesn't compile.

It also prints errors to the log non-stop until the processor is deleted.

The bug was found in NiFi 1.1, but is reproducible up to and including 1.5.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4953) FetchHBaseRow filling logs with unnecessary error messages

2018-03-09 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-4953:

Issue Type: Bug  (was: New Feature)

> FetchHBaseRow filling logs with unnecessary error messages
> --
>
> Key: NIFI-4953
> URL: https://issues.apache.org/jira/browse/NIFI-4953
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.5.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>
> FetchHBaseRow prints error messages into the logs when a rowkey is not found. 
> Such messages generate a lot of unnecessary log volume and affect log-based 
> monitoring.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-4953) FetchHBaseRow filling logs with unnecessary error messages

2018-03-09 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-4953:
---

 Summary: FetchHBaseRow filling logs with unnecessary error messages
 Key: NIFI-4953
 URL: https://issues.apache.org/jira/browse/NIFI-4953
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Extensions
Affects Versions: 1.5.0
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


FetchHBaseRow prints error messages into the logs when a rowkey is not found. 
Such messages generate a lot of unnecessary log volume and affect log-based 
monitoring.
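The proposed behavior can be illustrated simply: a missing rowkey is an expected, routable outcome rather than an error. A sketch with the HBase lookup simulated by a Map (all names here are hypothetical, not the processor's actual code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustration only: a missing rowkey routes to a "not found" relationship
// and is logged at debug level, instead of producing an ERROR log entry.
// The HBase lookup is simulated with a Map.
public class FetchRowDemo {

    private static final Logger LOG = Logger.getLogger(FetchRowDemo.class.getName());

    // Returns the name of the relationship the flow file would be routed to.
    public static String fetchRow(Map<String, String> table, String rowKey) {
        String row = table.get(rowKey);
        if (row == null) {
            // Debug-level only: keeps log-based monitoring quiet for routine misses.
            LOG.log(Level.FINE, "Row key {0} not found", rowKey);
            return "not found";
        }
        return "success";
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("row1", "value1");
        System.out.println(fetchRow(table, "row1"));
        System.out.println(fetchRow(table, "row2"));
    }
}
```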



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor

2018-03-03 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384870#comment-16384870
 ] 

Ed Berezitsky commented on NIFI-4906:
-

Attached the API docs (NiFi usage page) for the processor. Would love to hear 
feedback from the community.

> Add GetHdfsFileInfo Processor
> -
>
> Key: NIFI-4906
> URL: https://issues.apache.org/jira/browse/NIFI-4906
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Attachments: NiFi-GetHDFSFileInfo.pdf
>
>
> Add *GetHdfsFileInfo* Processor to be able to get stats from a file system.
> This processor should support recursive scan, getting information of 
> directories and files.
> _File-level info required_: name, path, length, modified timestamp, last 
> access timestamp, owner, group, permissions.
> _Directory-level info required_: name, path, sum of lengths of files under a 
> dir, count of files under a dir, modified timestamp, last access timestamp, 
> owner, group, permissions.
>  
> The result can be returned:
>  * in a single flow file (in content - a json line per file/dir info);
>  * as a flow file per file/dir info (in content as a json object, or in a set 
> of attributes, by choice).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4906) Add GetHdfsFileInfo Processor

2018-03-03 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-4906:

Attachment: NiFi-GetHDFSFileInfo.pdf

> Add GetHdfsFileInfo Processor
> -
>
> Key: NIFI-4906
> URL: https://issues.apache.org/jira/browse/NIFI-4906
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
> Attachments: NiFi-GetHDFSFileInfo.pdf
>
>
> Add *GetHdfsFileInfo* Processor to be able to get stats from a file system.
> This processor should support recursive scan, getting information of 
> directories and files.
> _File-level info required_: name, path, length, modified timestamp, last 
> access timestamp, owner, group, permissions.
> _Directory-level info required_: name, path, sum of lengths of files under a 
> dir, count of files under a dir, modified timestamp, last access timestamp, 
> owner, group, permissions.
>  
> The result can be returned:
>  * in a single flow file (in content - a json line per file/dir info);
>  * as a flow file per file/dir info (in content as a json object, or in a set 
> of attributes, by choice).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (NIFI-4906) Add GetHdfsFileInfo Processor

2018-02-26 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377031#comment-16377031
 ] 

Ed Berezitsky edited comment on NIFI-4906 at 2/26/18 3:28 PM:
--

ListHDFS is stateful and doesn't support directory-level info in its result set. 
It also doesn't support incoming connections. Sometimes you don't need to "get" a 
file (list + fetch); you just need to know whether the file(s)/dir(s) exist, plus 
all the information related to them (size, permissions, and the others listed in 
the description). Since HDF is not always running on an edge node of the HDP 
cluster, you also cannot use a script processor to run hdfs dfs commands. So this 
effort is to create a kind of HDFS client for read-only operations (-count, -du, 
-ls, -test and some others).

I hope it makes sense.


was (Author: bdesert):
ListHDFS is stateful and doesn't support directory-level info in its result set. 
It also doesn't support incoming connections. Sometimes you don't need to "get" a 
file; you just need to know whether the file(s)/dir(s) exist, plus all the 
information related to them (size, permissions, and the others listed in the 
description). Since HDF is not always running on an edge node of the HDP cluster, 
you also cannot use a script processor to run hdfs dfs commands. So this effort 
is to create a kind of HDFS client for read-only operations (-count, -du, -ls, 
-test and some others).

I hope it makes sense.

> Add GetHdfsFileInfo Processor
> -
>
> Key: NIFI-4906
> URL: https://issues.apache.org/jira/browse/NIFI-4906
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>
> Add *GetHdfsFileInfo* Processor to be able to get stats from a file system.
> This processor should support recursive scan, getting information of 
> directories and files.
> _File-level info required_: name, path, length, modified timestamp, last 
> access timestamp, owner, group, permissions.
> _Directory-level info required_: name, path, sum of lengths of files under a 
> dir, count of files under a dir, modified timestamp, last access timestamp, 
> owner, group, permissions.
>  
> The result can be returned:
>  * in a single flow file (in content - a json line per file/dir info);
>  * as a flow file per file/dir info (in content as a json object, or in a set 
> of attributes, by choice).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor

2018-02-26 Thread Ed Berezitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377031#comment-16377031
 ] 

Ed Berezitsky commented on NIFI-4906:
-

ListHDFS is stateful and doesn't support directory-level info in its result set. 
It also doesn't support incoming connections. Sometimes you don't need to "get" a 
file; you just need to know whether the file(s)/dir(s) exist, plus all the 
information related to them (size, permissions, and the others listed in the 
description). Since HDF is not always running on an edge node of the HDP cluster, 
you also cannot use a script processor to run hdfs dfs commands. So this effort 
is to create a kind of HDFS client for read-only operations (-count, -du, -ls, 
-test and some others).

I hope it makes sense.

> Add GetHdfsFileInfo Processor
> -
>
> Key: NIFI-4906
> URL: https://issues.apache.org/jira/browse/NIFI-4906
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>
> Add *GetHdfsFileInfo* Processor to be able to get stats from a file system.
> This processor should support recursive scan, getting information of 
> directories and files.
> _File-level info required_: name, path, length, modified timestamp, last 
> access timestamp, owner, group, permissions.
> _Directory-level info required_: name, path, sum of lengths of files under a 
> dir, count of files under a dir, modified timestamp, last access timestamp, 
> owner, group, permissions.
>  
> The result can be returned:
>  * in a single flow file (in content - a json line per file/dir info);
>  * as a flow file per file/dir info (in content as a json object, or in a set 
> of attributes, by choice).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-4906) Add GetHdfsFileInfo Processor

2018-02-23 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-4906:
---

 Summary: Add GetHdfsFileInfo Processor
 Key: NIFI-4906
 URL: https://issues.apache.org/jira/browse/NIFI-4906
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Extensions
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


Add *GetHdfsFileInfo* Processor to be able to get stats from a file system.

This processor should support recursive scan, getting information of 
directories and files.

_File-level info required_: name, path, length, modified timestamp, last access 
timestamp, owner, group, permissions.

_Directory-level info required_: name, path, sum of lengths of files under a 
dir, count of files under a dir, modified timestamp, last access timestamp, 
owner, group, permissions.

 

The result can be returned:
 * in a single flow file (in content - a json line per file/dir info);
 * as a flow file per file/dir info (in content as a json object, or in a set of 
attributes, by choice).
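The proposed "one json line per file/dir info" output could look like the following sketch, which uses the local filesystem via java.nio.file in place of the HDFS client. Field names are illustrative, not a final schema, and no JSON escaping is done:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the proposed output: one json line per file or directory,
// built from java.nio.file attributes on the local filesystem as a
// stand-in for the HDFS FileStatus API. Field names are illustrative.
public class FileInfoJson {

    public static String toJsonLine(Path path) throws IOException {
        boolean dir = Files.isDirectory(path);
        long length = dir ? 0 : Files.size(path);
        long modified = Files.getLastModifiedTime(path).toMillis();
        // Note: no JSON string escaping here; illustration only.
        return String.format(
            "{\"name\":\"%s\",\"path\":\"%s\",\"type\":\"%s\",\"length\":%d,\"modifiedTs\":%d}",
            path.getFileName(), path.toAbsolutePath(),
            dir ? "directory" : "file", length, modified);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("info", ".txt");
        Files.writeString(tmp, "hello");
        System.out.println(toJsonLine(tmp));
        Files.delete(tmp);
    }
}
```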



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4833) NIFI-4833 Add ScanHBase processor

2018-01-31 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-4833:

Description: 
Add ScanHBase (new) processor to retrieve records from HBase tables.

Today there are GetHBase and FetchHBaseRow. GetHBase can pull an entire table or 
only new rows created after the processor started; it also must be scheduled and 
doesn't support incoming connections. FetchHBaseRow can pull rows with known 
rowkeys only.

This processor could provide functionality similar to what can be reached by 
using the hbase shell, by defining the following properties:

-scan based on range of row key IDs 

-scan based on range of time stamps

-limit number of records pulled

-use filters

-reverse rows

  was:
Add ScanHBase (new) processor to retrieve records from HBase tables.

Today there are GetHBase and FetchHBaseRow. GetHBase can pull an entire table or 
only new rows created after the processor started. FetchHBaseRow can pull records 
with known rowkeys.

This processor could provide functionality similar to what can be reached by 
using the hbase shell, by defining the following properties:

-scan based on range of row key IDs 

-scan based on range of time stamps

-limit number of records pulled

-use filters

-reverse rows


> NIFI-4833 Add ScanHBase processor
> -
>
> Key: NIFI-4833
> URL: https://issues.apache.org/jira/browse/NIFI-4833
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>
> Add ScanHBase (new) processor to retrieve records from HBase tables.
> Today there are GetHBase and FetchHBaseRow. GetHBase can pull an entire table 
> or only new rows created after the processor started; it also must be scheduled 
> and doesn't support incoming connections. FetchHBaseRow can pull rows with 
> known rowkeys only.
> This processor could provide functionality similar to what can be reached 
> by using the hbase shell, by defining the following properties:
> -scan based on range of row key IDs 
> -scan based on range of time stamps
> -limit number of records pulled
> -use filters
> -reverse rows



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4833) NIFI-4833 Add ScanHBase processor

2018-01-31 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-4833:

Description: 
Add ScanHBase (new) processor to retrieve records from HBase tables.

Today there are GetHBase and FetchHBaseRow. GetHBase can pull an entire table or 
only new rows created after the processor started. FetchHBaseRow can pull records 
with known rowkeys.

This processor could provide functionality similar to what can be reached by 
using the hbase shell, by defining the following properties:

-scan based on range of row key IDs 

-scan based on range of time stamps

-limit number of records pulled

-use filters

-reverse rows

  was:
Add ScanHBase (new) processor to retrieve records from HBase tables.

This processor could have the following features:

-scan based on range of row key IDs 

-scan based on range of time stamps

-limit number of records pulled

-use filters


> NIFI-4833 Add ScanHBase processor
> -
>
> Key: NIFI-4833
> URL: https://issues.apache.org/jira/browse/NIFI-4833
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>
> Add ScanHBase (new) processor to retrieve records from HBase tables.
> Today there are GetHBase and FetchHBaseRow. GetHBase can pull an entire table 
> or only new rows created after the processor started. FetchHBaseRow can pull 
> records with known rowkeys.
> This processor could provide functionality similar to what can be reached 
> by using the hbase shell, by defining the following properties:
> -scan based on range of row key IDs 
> -scan based on range of time stamps
> -limit number of records pulled
> -use filters
> -reverse rows



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4833) NIFI-4833 Add ScanHBase processor

2018-01-31 Thread Ed Berezitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-4833:

Summary: NIFI-4833 Add ScanHBase processor  (was: Add ScanHBase processor)

> NIFI-4833 Add ScanHBase processor
> -
>
> Key: NIFI-4833
> URL: https://issues.apache.org/jira/browse/NIFI-4833
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>
> Add ScanHBase (new) processor to retrieve records from HBase tables.
> This processor could have the following features:
> -scan based on range of row key IDs 
> -scan based on range of time stamps
> -limit number of records pulled
> -use filters



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-4833) Add ScanHBase processor

2018-01-31 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-4833:
---

 Summary: Add ScanHBase processor
 Key: NIFI-4833
 URL: https://issues.apache.org/jira/browse/NIFI-4833
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Extensions
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


Add ScanHBase (new) processor to retrieve records from HBase tables.

This processor could have the following features:

-scan based on range of row key IDs 

-scan based on range of time stamps

-limit number of records pulled

-use filters
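The rowkey-range scan semantics implied above (start key inclusive, stop key exclusive, optional limit and reverse order) can be sketched with an in-memory sorted map standing in for an HBase table. All names here are hypothetical, for illustration only:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration of rowkey-range scan semantics using a TreeMap as a
// stand-in for an HBase table: start row inclusive, stop row exclusive,
// with an optional result limit and reversed iteration order.
public class ScanRangeDemo {

    public static NavigableMap<String, String> scan(
            NavigableMap<String, String> table,
            String startRow, String stopRow, int limit, boolean reversed) {
        NavigableMap<String, String> range = table.subMap(startRow, true, stopRow, false);
        if (reversed) {
            range = range.descendingMap();
        }
        NavigableMap<String, String> result = new TreeMap<>(range.comparator());
        for (Map.Entry<String, String> e : range.entrySet()) {
            if (result.size() >= limit) {
                break; // honor the row limit
            }
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("row01", "a");
        table.put("row02", "b");
        table.put("row03", "c");
        table.put("row04", "d");
        System.out.println(scan(table, "row02", "row04", 10, false).keySet());
    }
}
```

A timestamp-range scan and filters would follow the same shape, restricting which cells are emitted rather than which rowkeys are visited.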



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)