[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-24 Thread mattyb149
Github user mattyb149 commented on the issue:

https://github.com/apache/nifi/pull/883
  
+1 LGTM, tested all the logic. Thanks for the great contribution! Merging 
to master


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-24 Thread patricker
Github user patricker commented on the issue:

https://github.com/apache/nifi/pull/883
  
Did you need me to change something still, or are you doing more testing?




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-24 Thread mattyb149
Github user mattyb149 commented on the issue:

https://github.com/apache/nifi/pull/883
  
Sounds good. Yeah I used the hex and base64 stuff with no issues.




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-24 Thread patricker
Github user patricker commented on the issue:

https://github.com/apache/nifi/pull/883
  
You absolutely could put your Smilies into a binary field, but _before_ 
you send them to PutSQL you need to convert them to binary using something like 
string.getBytes("UTF-8").  At that point they become plain bytes and they 
should load without issue.

To get the flow file content into binary form I would use the Base64EncodeContent 
processor first, then use ReplaceText to move it to an attribute, and finally 
push it using PutSQL with the 'base64' option.
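
For anyone following along, here is a minimal Java sketch of the two conversions mentioned above: text to UTF-8 bytes, and bytes to a Base64 string that can safely sit in an attribute. The class and method names are made up for illustration; this is not code from the PR or from any NiFi processor.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only.
class Utf8ToBase64Sketch {
    // Encode text as UTF-8 bytes, then as Base64 so the bytes survive being held in a String.
    static String toBase64(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(utf8);
    }

    // Reverse step: recover the raw bytes from the Base64 text.
    static byte[] fromBase64(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        String smiley = "\u263A"; // WHITE SMILING FACE, three bytes in UTF-8
        String encoded = toBase64(smiley);
        System.out.println(encoded + " decodes to " + fromBase64(encoded).length + " bytes");
    }
}
```

The Base64 text contains only ASCII characters, so it should pass through attribute handling unchanged no matter which bytes it encodes.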




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-24 Thread mattyb149
Github user mattyb149 commented on the issue:

https://github.com/apache/nifi/pull/883
  
That part makes sense, but what about when I put three emoticons in the 
UpdateAttribute processor to set the value and then get three bytes in the 
table, all with value 63? I would've expected 6 bytes, two per smiley as a 
Unicode or UTF-16 value. I'll check the flow file to see if that's being done 
before PutSQL.
Mostly I was testing other ways of getting binary data into the processor 
vs Avro; I don't have an Avro file with binary data in it. I'll create one and 
put it through its paces. Also, I wonder what happens when you have binary data 
as the payload of a flow file and want to use that in PutSQL, using ExtractText 
to get it into an attribute.
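
One plausible explanation for the three 63s, sketched in plain Java rather than taken from the processor: 63 is the code for '?', which is what `String.getBytes` substitutes when a character cannot be represented in the target charset. The example assumes each emoticon is a single character outside ASCII (U+263A here); the class name is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class EmoticonBytesSketch {
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String three = "\u263A\u263A\u263A"; // three smileys, assumed to be U+263A
        // US-ASCII cannot represent U+263A, so each one is replaced with '?' (63).
        System.out.println(Arrays.toString(encode(three, StandardCharsets.US_ASCII)));
        // UTF-16BE uses two bytes per smiley, UTF-8 uses three.
        System.out.println(encode(three, StandardCharsets.UTF_16BE).length);
        System.out.println(encode(three, StandardCharsets.UTF_8).length);
    }
}
```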




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-24 Thread patricker
Github user patricker commented on the issue:

https://github.com/apache/nifi/pull/883
  
Matt, 'ascii' doesn't quite mean what it appears to :smile: .

If you convert an array of random bytes to UTF8 and then back to bytes you 
will find you have considerably more bytes than you started with.  This is 
because in order to represent certain bytes as actual characters UTF8 has to 
insert extra marker bytes; then when you convert back to bytes these extra 
bytes come back too.  I've seen the conversion of binary data -> UTF8 -> binary 
data grow by 40%.

The key thing to remember for this processor is that the data coming in 
only looks like text because it is contained in an attribute; it's not actually 
text, it's raw bytes. In this processor 'ascii' means that each character 
represents a single byte; calling string.getBytes("ASCII") is just a handy 
shortcut in Java to get this functionality.

I can rename it to 'onecharperbyte' if that makes more sense.
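
A small standalone Java sketch of the round-trip effect described above, using only JDK charsets. It is not the processor's code, and the class name is made up. As a side note, in standard Java only ISO-8859-1 maps all 256 byte values to one character each; US-ASCII is lossy for bytes above 0x7F, which may be worth keeping in mind when picking the option name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class ByteRoundTripSketch {
    // Decode raw bytes as text in the given charset, then encode the text back to bytes.
    static byte[] roundTrip(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 0x41}; // two bytes that are never valid UTF-8, then 'A'

        // UTF-8: invalid bytes become replacement characters, so the result is longer.
        System.out.println("UTF-8 length: " + roundTrip(raw, StandardCharsets.UTF_8).length);
        // ISO-8859-1: one character per byte, so the round trip is lossless.
        System.out.println("ISO-8859-1 equal: " + Arrays.equals(raw, roundTrip(raw, StandardCharsets.ISO_8859_1)));
        // US-ASCII: same length, but bytes above 0x7F do not survive.
        System.out.println("US-ASCII equal: " + Arrays.equals(raw, roundTrip(raw, StandardCharsets.US_ASCII)));
    }
}
```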




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-22 Thread patricker
Github user patricker commented on the issue:

https://github.com/apache/nifi/pull/883
  
OK, good to go.




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-22 Thread patricker
Github user patricker commented on the issue:

https://github.com/apache/nifi/pull/883
  
Hold on that update... looks like I got my branches crossed.  I'll get it 
cleaned up.




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-19 Thread mattyb149
Github user mattyb149 commented on the issue:

https://github.com/apache/nifi/pull/883
  
That sounds good. Should we move the discussion into an email thread on the 
users or dev list?




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-19 Thread patricker
Github user patricker commented on the issue:

https://github.com/apache/nifi/pull/883
  
I was thinking about this and I had a new idea for a solution.  What if I 
added code that tried to read a new, optional attribute of the format 
`sql.args.N.format`?  For the moment it would be just for binary data, with 
values like 'hex' or 'ascii' (something like that).  But it could be expanded 
down the road to also support things like a more flexible version of my 
Timestamp PR so users could optionally provide their own Timestamp format.
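
To make the idea concrete, here is a rough Java sketch of how a format value could select the decoding of a binary argument. Everything in it is an assumption for illustration: the class and method names, falling back to 'ascii' when no format is given, and the 'base64' case (which only comes up elsewhere in this thread). It is not the code in the PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only.
class BinaryFormatSketch {
    // Turn an attribute value into bytes according to a format hint.
    static byte[] decode(String value, String format) {
        String effective = (format == null) ? "ascii" : format; // assumed default
        switch (effective) {
            case "ascii":
                return value.getBytes(StandardCharsets.US_ASCII);
            case "hex":
                return hexToBytes(value);
            case "base64":
                return Base64.getDecoder().decode(value);
            default:
                throw new IllegalArgumentException("unsupported format: " + format);
        }
    }

    // Parse pairs of hex digits (no 0x prefix) into bytes.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex value needs an even number of digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit in: " + hex);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(decode("0A1f", "hex").length);
        System.out.println(decode("AB", null).length);
    }
}
```

An explicit format attribute avoids guessing the encoding from the value itself, which is the ambiguity the prefix check discussed earlier in the thread runs into.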




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-18 Thread patricker
Github user patricker commented on the issue:

https://github.com/apache/nifi/pull/883
  
I think that supporting both would be nice, but there is a one in a million 
chance that someone will provide an Avro binary string that matches the hex 
validation checks causing data corruption... I'm OK with going with just the 
Avro format for now.




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-18 Thread mattyb149
Github user mattyb149 commented on the issue:

https://github.com/apache/nifi/pull/883
  
A complex problem for sure. Certainly you are right with the Jira that 
Binary types are currently not supported, and if the use case you describe 
(coming from Avro) is the most common, then this is clearly an improvement :)  
Perhaps we could add this improvement and then write a Jira to handle the 
general case. Checking for a prefix is a decent approach but nothing will be 
bulletproof, such is the nature of being able to describe the type of the data 
vs being able to represent it when it must exist as a String (for attribute 
values). The workaround for byte literals in the interim could be to put them 
directly in the SQL statement (instead of attribute values).




[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types

2016-08-18 Thread patricker
Github user patricker commented on the issue:

https://github.com/apache/nifi/pull/883
  
Integer.decode won't work for the case I described, where the value is 
coming from an upstream Avro file. I could put in a check to see if the string 
starts with 0x and contains only hex characters, and if so parse it as you 
suggested.  Thoughts?
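
Roughly what I have in mind for the check, as a standalone sketch. The class and method names are placeholders, and requiring an even number of digits is my own assumption; none of this is in the PR yet.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class HexPrefixSketch {
    // "0x" followed by one or more complete byte pairs of hex digits.
    private static final Pattern HEX_LITERAL = Pattern.compile("0[xX](?:[0-9a-fA-F]{2})+");

    static boolean looksLikeHexLiteral(String value) {
        return value != null && HEX_LITERAL.matcher(value).matches();
    }

    // Parse a value that already passed looksLikeHexLiteral into bytes.
    static byte[] parse(String value) {
        if (!looksLikeHexLiteral(value)) {
            throw new IllegalArgumentException("not a 0x-prefixed hex literal: " + value);
        }
        String digits = value.substring(2);
        byte[] out = new byte[digits.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(digits.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeHexLiteral("0xCAFE")); // matches the pattern
        System.out.println(looksLikeHexLiteral("hello"));  // does not match
    }
}
```

The catch is that a value whose raw characters happen to spell something like `0xCAFE` would also pass the check and be parsed as two bytes, so the check can misread real data.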

