[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user mattyb149 commented on the issue: https://github.com/apache/nifi/pull/883 +1 LGTM, tested all the logic. Thanks for the great contribution! Merging to master --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user patricker commented on the issue: https://github.com/apache/nifi/pull/883 Did you need me to change something still, or are you doing more testing?
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user mattyb149 commented on the issue: https://github.com/apache/nifi/pull/883 Sounds good. Yeah I used the hex and base64 stuff with no issues.
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user patricker commented on the issue: https://github.com/apache/nifi/pull/883 You absolutely could put your smileys into a binary field, but _before_ you send them to PutSQL you need to convert them to binary using something like string.getBytes("UTF-8"). At that point they become plain bytes and they should load without issue. To get the flow file into binary I would use the Base64EncodeContent processor first, then use ReplaceText to move it to an attribute, and finally push it using PutSQL with the 'base64' option.
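As a rough illustration of the round trip described above (text to UTF-8 bytes, to a Base64 string that can sit in an attribute, and back to bytes), here is a minimal sketch. The class and method names are hypothetical and are not taken from the PR; it only uses `java.util.Base64` from the JDK and says nothing about how the processors themselves are implemented.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not part of NiFi: shows why Base64 is a safe carrier for bytes in a String.
public class Base64AttributeSketch {

    // Encode text as UTF-8 bytes, then as a Base64 string that is safe to hold in an attribute.
    public static String toBase64Attribute(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode the Base64 attribute value back into the original bytes.
    public static byte[] fromBase64Attribute(String attributeValue) {
        return Base64.getDecoder().decode(attributeValue);
    }

    public static void main(String[] args) {
        String smileys = "\u263A\u263A\u263A"; // three WHITE SMILING FACE characters
        String attribute = toBase64Attribute(smileys);
        byte[] restored = fromBase64Attribute(attribute);
        // The restored bytes are identical to the UTF-8 encoding of the original text.
        System.out.println(Arrays.equals(restored, smileys.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because the Base64 alphabet is plain ASCII, the attribute value survives being handled as a string, which is the property the 'base64' option relies on.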
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user mattyb149 commented on the issue: https://github.com/apache/nifi/pull/883 That part makes sense, but what about when I put three emoticons in the UpdateAttribute processor to set the value? I then get three bytes in the table, all with value 63. I would've expected 6 bytes, two per smiley as a Unicode or UTF-16 value. I'll check the flow file to see if that's being done before PutSQL. Mostly I was testing other ways of getting binary data into the processor vs Avro; I don't have an Avro file with binary data in it. I'll create one and take it through the paces. Also, I wonder what happens when you have binary data as the payload of a flow file and want to use that in PutSQL, using ExtractText to get it into an attribute.
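One plausible explanation for the three bytes of value 63 is an ASCII encoding step: 63 is the code for `?`, which Java's ASCII encoder substitutes for any character it cannot represent. The sketch below assumes the smileys are U+263A and uses a hypothetical class name; it does not claim to reproduce the exact code path inside PutSQL.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares how many bytes three smileys take in different encodings.
public class SmileyBytesSketch {

    // Returns the encoded lengths for US-ASCII, UTF-16BE, and UTF-8, in that order.
    public static int[] byteCounts(String s) {
        return new int[] {
            s.getBytes(StandardCharsets.US_ASCII).length,
            s.getBytes(StandardCharsets.UTF_16BE).length,
            s.getBytes(StandardCharsets.UTF_8).length
        };
    }

    public static void main(String[] args) {
        String smileys = "\u263A\u263A\u263A"; // assumption: each emoticon is U+263A
        for (byte b : smileys.getBytes(StandardCharsets.US_ASCII)) {
            System.out.println(b); // prints 63 for each unmappable character
        }
        int[] counts = byteCounts(smileys);
        System.out.println("ASCII=" + counts[0] + " UTF-16BE=" + counts[1] + " UTF-8=" + counts[2]);
    }
}
```

Under these assumptions the ASCII form is 3 bytes, the UTF-16BE form is the expected 6 bytes, and the UTF-8 form is 9 bytes.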
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user patricker commented on the issue: https://github.com/apache/nifi/pull/883 Matt, 'ascii' doesn't quite mean what it appears to :smile: . If you convert an array of random bytes to UTF8 and then back to bytes, you will find you have considerably more bytes than you started with. This is because, in order to represent certain bytes as actual characters, UTF8 has to insert extra marker bytes; when you convert back to bytes, these extra bytes come back too. I've seen the conversion of binary data -> UTF8 -> binary data grow by 40%. The key thing to remember for this processor is that the data coming in only looks like text because it is contained in an attribute; it's not actually text, it's raw bytes. In this processor 'ascii' means that each character represents a single byte; calling string.getBytes("ASCII") is just a handy shortcut in Java to get this functionality. I can rename it to 'onecharperbyte' if that makes more sense.
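The growth described above can be reproduced with a small sketch. It round-trips a few arbitrary bytes through a `String` using two charsets: UTF-8, where bytes that are not valid UTF-8 are decoded to a replacement character that re-encodes as several bytes, and ISO-8859-1, which is used here only as an example of a charset that maps each of the 256 byte values to exactly one character. The class name and sample bytes are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: bytes -> String -> bytes under different charsets.
public class RoundTripSketch {

    // Decode raw bytes as text in the given charset, then encode the text back to bytes.
    public static byte[] roundTrip(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0xFF, (byte) 0xFE, 0x00, 0x41 }; // arbitrary binary data

        byte[] viaUtf8 = roundTrip(raw, StandardCharsets.UTF_8);
        byte[] viaLatin1 = roundTrip(raw, StandardCharsets.ISO_8859_1);

        System.out.println("original:  " + raw.length + " bytes");
        System.out.println("via UTF-8: " + viaUtf8.length + " bytes, unchanged=" + Arrays.equals(raw, viaUtf8));
        System.out.println("via 8859-1: " + viaLatin1.length + " bytes, unchanged=" + Arrays.equals(raw, viaLatin1));
    }
}
```

Note that Java's US-ASCII charset only covers values 0 to 127, so whether a given byte survives depends on which charset is used on each side of the conversion.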
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user patricker commented on the issue: https://github.com/apache/nifi/pull/883 OK, good to go.
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user patricker commented on the issue: https://github.com/apache/nifi/pull/883 Hold on that update.. looks like I got my branches crossed. I'll get it cleaned up.
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user mattyb149 commented on the issue: https://github.com/apache/nifi/pull/883 That sounds good. Should we move the discussion into an email thread on the users or dev list?
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user patricker commented on the issue: https://github.com/apache/nifi/pull/883 I was thinking about this and I had a new idea for a solution. What if I added code that tried to read a new, optional attribute of the format `sql.args.N.format`? For the moment it would be just for binary data, with values like 'hex' or 'ascii' (something like that). But it could be expanded down the road to also support things like a more flexible version of my Timestamp PR, so users could optionally provide their own Timestamp format.
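A minimal sketch of how such a per-argument format hint might be applied is shown below. The class, the method names, and the exact set of accepted values ('hex', 'base64', 'ascii') are assumptions made for illustration based on the options mentioned in this thread; this is not the code from the PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: turns an attribute value into bytes according to a format hint.
public class BinaryFormatSketch {

    // 'format' stands in for the value of an attribute such as sql.args.N.format.
    public static byte[] decode(String value, String format) {
        switch (format) {
            case "hex":
                return parseHex(value);
            case "base64":
                return Base64.getDecoder().decode(value);
            case "ascii":
                // One byte per character for characters in the ASCII range.
                return value.getBytes(StandardCharsets.US_ASCII);
            default:
                throw new IllegalArgumentException("Unknown binary format: " + format);
        }
    }

    // Parse an even-length string of hex digits (no prefix) into bytes.
    static byte[] parseHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex digit at position " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

For example, `decode("CAFE", "hex")` and `decode("yv4=", "base64")` both yield the two bytes 0xCA 0xFE. An explicit hint like this avoids having to guess the encoding from the value itself.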
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user patricker commented on the issue: https://github.com/apache/nifi/pull/883 I think that supporting both would be nice, but there is a one in a million chance that someone will provide an Avro binary string that matches the hex validation checks causing data corruption... I'm OK with going with just the Avro format for now.
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user mattyb149 commented on the issue: https://github.com/apache/nifi/pull/883 A complex problem for sure. Certainly you are right with the Jira that Binary types are currently not supported, and if the use case you describe (coming from Avro) is the most common, then this is clearly an improvement :) Perhaps we could add this improvement and then write a Jira to handle the general case. Checking for a prefix is a decent approach, but nothing will be bulletproof. Such is the nature of being able to describe the type of the data vs being able to represent it when it must exist as a String (for attribute values). The workaround for byte literals in the interim could be to put them directly in the SQL statement (instead of attribute values).
[GitHub] nifi issue #883: NIFI-2591 - PutSQL has no handling for Binary data types
Github user patricker commented on the issue: https://github.com/apache/nifi/pull/883 Integer.decode won't work for the case I described, where the value is coming from an upstream Avro file. I could put in a check to see if the string starts with 0x and contains only hex characters, and if so parse it as you suggested. Thoughts?
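The prefix check floated above could look roughly like the following sketch. The class name, the regular expression, and the requirement of an even number of hex digits are assumptions for illustration, not the PR's actual validation.

```java
import java.util.regex.Pattern;

// Hypothetical helper: detects values written as 0x-prefixed hex and converts them to bytes.
public class HexPrefixSketch {

    // "0x" or "0X" followed by one or more pairs of hex digits, and nothing else.
    private static final Pattern HEX_LITERAL = Pattern.compile("0[xX](?:[0-9a-fA-F]{2})+");

    public static boolean looksLikeHexLiteral(String value) {
        return value != null && HEX_LITERAL.matcher(value).matches();
    }

    // Convert a value that passed looksLikeHexLiteral into its bytes.
    public static byte[] parseHexLiteral(String value) {
        if (!looksLikeHexLiteral(value)) {
            throw new IllegalArgumentException("Not a 0x-prefixed hex literal: " + value);
        }
        String digits = value.substring(2);
        byte[] out = new byte[digits.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Each pair of hex digits becomes one byte.
            out[i] = (byte) Integer.parseInt(digits.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

A check like this cannot tell a hex literal apart from raw data that happens to consist of the characters `0x4142`, which is the ambiguity raised elsewhere in this thread.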