Yes, it is a limit on the output: "when converted to a binary string would
exceed 256 bytes as it does not reallocate the output buffer".
Java does not have optional arguments (it is a strongly typed language).
Java has overloaded functions :).
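For illustration, the two forms of substr mentioned below can be written in
plain Java as overloads, with the shorter form delegating to the longer one.
This is only a sketch of Java overloading: the class name and the 1-based
start convention are assumptions, and it is not how Drill registers its
functions.

```java
final class SubstrOverloads {
    // Two-argument form: everything from the 1-based start position onward.
    static String substr(String data, int start) {
        return substr(data, start, data.length() - start + 1);
    }

    // Three-argument form: at most `length` characters from the 1-based start.
    static String substr(String data, int start, int length) {
        int begin = Math.max(start - 1, 0);
        int end = Math.min(begin + Math.max(length, 0), data.length());
        return begin >= end ? "" : data.substring(begin, end);
    }

    public static void main(String[] args) {
        System.out.println(substr("drillbit", 6));    // prints "bit"
        System.out.println(substr("drillbit", 1, 5)); // prints "drill"
    }
}
```

The compiler picks the overload from the number and types of the arguments,
which is what makes a call look as if it had an optional parameter.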
I'd suggest filing another JIRA. DRILL-6607 is a bug; a new JIRA would be a
request for new functionality.
Thank you,
Vlad
On 7/18/18 05:15, John Omernik wrote:
Interesting, so the 256 limit is on output, not on input?
Is DRILL-6607 enough to track this? If so, I have one more "feature" to
add to it; I'm not sure whether I should include it in 6607 or create a new
JIRA and link them. Basically, I'd like the ability to pass an int to the
function. (Does Java have optional arguments? It must, because of
substr(data, start) and substr(data, start, nochars).) Basically,
string_binary(data) would work as intended (with the limitation fixed) and
string_binary(data, 1) would work on the binary but replace EVERY character
with its hex representation. An optional third argument could be a format
string of some sort so the user could pick the output, but I like the idea
of having every character as hex for analysis.
John
On Tue, Jul 17, 2018 at 3:40 PM, Vlad Rozov <[email protected]> wrote:
A. +1.
B. Every byte of binary data may require up to 4 bytes (0xXX) in the
string representation, so 80 may work, 60 should reliably work.
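The arithmetic behind those numbers can be sketched as follows. It assumes a
fixed 256-byte output buffer and a 4-character escape for every byte that is
not printable ASCII, with printable bytes copied through as one character.
The class, constant, and method names are hypothetical, and the exact set of
bytes that string_binary escapes is an assumption here.

```java
final class EscapedLength {
    static final int OUTPUT_LIMIT = 256; // assumed fixed output buffer size
    static final int ESCAPE_WIDTH = 4;   // assumed width of one escaped byte

    // Assumption: printable ASCII (0x20..0x7E) is copied as-is, all else is escaped.
    static boolean isPrintable(byte b) {
        return b >= 0x20 && b <= 0x7E;
    }

    // Number of output characters needed for the given input bytes.
    static int escapedLength(byte[] data) {
        int length = 0;
        for (byte b : data) {
            length += isPrintable(b) ? 1 : ESCAPE_WIDTH;
        }
        return length;
    }

    // Largest input size that fits even when every byte must be escaped.
    static int maxSafeInputBytes() {
        return OUTPUT_LIMIT / ESCAPE_WIDTH;
    }

    public static void main(String[] args) {
        byte[] worstCase = new byte[80]; // 80 zero bytes, none of them printable
        System.out.println(escapedLength(worstCase)); // prints 320
        System.out.println(maxSafeInputBytes());      // prints 64
    }
}
```

Under these assumptions 64 bytes is the largest input that can never
overflow, and an 80-byte input fits only if at least 22 of its bytes are
printable.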
Thank you,
Vlad
On 7/17/18 13:14, John Omernik wrote:
Yet this works?....
string_binary(byte_substr(`data`, 1, 80))
On Tue, Jul 17, 2018 at 3:12 PM, John Omernik <[email protected]> wrote:
So on B, I found BYTE_SUBSTR and now send only 200 bytes to the
string_binary function, but I still get an error. Something else is
happening here...
select `type`, `timestamp`, `src_ip`, `dst_ip`, `src_port`, `dst_port`,
string_binary(byte_substr(`data`, 1, 200)) as mydata from
`user/jomernik/bf2_7306.pcap` limit 10
I get the same
Error Id: 213075e7-378a-437f-a5dc-408326f123f3 on
zeta3.brewingintel.com:20005]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
IndexOutOfBoundsException: index: 0, length: 379 (expected: range(0,
256))
On Tue, Jul 17, 2018 at 12:56 PM, John Omernik <[email protected]> wrote:
Thanks Vlad, a couple of thoughts.
A. I think that should be fixed. That seems like a limitation that is
both unexpected and undocumented.
B. Is there a way, if my data in the table is returned as binary to
start with, for me to return the first 256 bytes? I tried substring, but it
tries to force the data to UTF-8 and I am getting some issues there.
On Tue, Jul 17, 2018 at 10:33 AM, Vlad Rozov <[email protected]> wrote:
In the case of DRILL-6607 the issue lies in the implementation of the
"string_binary" function: it is not prepared to handle incoming data that
when converted to a binary string would exceed 256 bytes as it does not
reallocate the output buffer. Until the function code is fixed, the only
way to avoid the error is either not to use "string_binary" or to use it
with data that meets the "string_binary" limitation.
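One direction such a fix could take, sketched outside of Drill: size the
output from the input instead of assuming 256 bytes. This is not the actual
string_binary code. The class and method names, the \xXX escape format, and
the printable range are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BinaryStringSketch {
    // Converts raw bytes to an escaped ASCII form without a fixed-size output.
    static byte[] toBinaryString(byte[] data) {
        // Worst case every input byte becomes a 4-byte escape, so allocate for that.
        byte[] out = new byte[data.length * 4];
        int pos = 0;
        for (byte b : data) {
            if (b >= 0x20 && b <= 0x7E && b != '\\') {
                out[pos++] = b; // printable ASCII is copied through unchanged
            } else {
                // Assumed escape format: backslash, 'x', two uppercase hex digits.
                byte[] esc = String.format("\\x%02X", b & 0xFF)
                        .getBytes(StandardCharsets.US_ASCII);
                System.arraycopy(esc, 0, out, pos, esc.length);
                pos += esc.length;
            }
        }
        return Arrays.copyOf(out, pos); // trim to the bytes actually written
    }

    public static void main(String[] args) {
        byte[] input = {'a', 0, (byte) 0xFF};
        // prints a\x00\xFF
        System.out.println(new String(toBinaryString(input), StandardCharsets.US_ASCII));
    }
}
```

Allocating for the worst case up front trades some memory for never having
to grow the buffer in the middle of the conversion.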
Thank you,
Vlad
On 7/13/18 14:01, Ted Dunning wrote:
There are bounds for acceptable behavior for a function like this. Array
index out of bounds is not acceptable. Aborting with a clean message about
the true problem might be fine, as would returning a null.
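The two acceptable behaviors named above could look roughly like this in
Java. It is a sketch only: the class and method names are hypothetical, the
256 limit is taken from the error message in this thread, and none of it is
Drill's actual code.

```java
final class BoundedOutput {
    static final int OUTPUT_LIMIT = 256; // assumed fixed output capacity

    // Option 1: abort with a message that names the real problem.
    static void checkFits(int requiredBytes) {
        if (requiredBytes > OUTPUT_LIMIT) {
            throw new IllegalArgumentException(
                    "string_binary output needs " + requiredBytes
                    + " bytes but the limit is " + OUTPUT_LIMIT + " bytes");
        }
    }

    // Option 2: return null when the converted value does not fit.
    static String orNull(String converted) {
        return converted.length() > OUTPUT_LIMIT ? null : converted;
    }

    public static void main(String[] args) {
        System.out.println(orNull("short value")); // prints "short value"
        try {
            checkFits(379);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Either way the caller learns that the value was too long for the function,
rather than seeing an index error from inside the buffer code.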
On Fri, Jul 13, 2018, 13:46 John Omernik <[email protected]> wrote:
So, as to the actual problem, I opened a JIRA here:
https://issues.apache.org/jira/browse/DRILL-6607
The reason I brought this here is my own curiosity: does an issue in using
this function most likely lie in the function code itself not handling good
data, or is the issue in the pcap plugin, which produces the data for this
function to consume? I am just curious how something like this could be
avoided.
John