A. +1.

B. Every byte of binary data may require up to 4 bytes (0xXX) in the string representation, so 80 may work, but 60 should work reliably (60 × 4 = 240, which fits in the 256-byte buffer).
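
The arithmetic behind those numbers can be sketched in a few lines of Python. This is only an illustration: the 256-byte limit comes from the error message in this thread, the factor of 4 from the estimate above, and the rule that printable ASCII stays one character while everything else takes four is an assumption, not a description of Drill's actual escaping.

```python
BUFFER_SIZE = 256          # limit reported in the IndexOutOfBoundsException
WORST_CASE_EXPANSION = 4   # assumed: up to 4 output bytes per input byte


def escaped_length(data: bytes) -> int:
    """Length of a hypothetical escaped form of `data`.

    Assumption: printable ASCII (0x20..0x7E) stays 1 character,
    every other byte expands to 4 characters.
    """
    return sum(1 if 0x20 <= b <= 0x7E else WORST_CASE_EXPANSION for b in data)


def max_safe_input(buffer_size: int = BUFFER_SIZE) -> int:
    """Largest input length that fits even if every byte expands fully."""
    return buffer_size // WORST_CASE_EXPANSION


print(max_safe_input())                 # worst-case safe input length
print(escaped_length(b"\x00" * 80))     # 80 non-printable bytes
print(escaped_length(b"A" * 80))        # 80 printable bytes
```

Under these assumptions, 80 bytes fit only when most of them are printable, while any input of 64 bytes or fewer fits regardless of content.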

Thank you,

Vlad

On 7/17/18 13:14, John Omernik wrote:
Yet this works?....

string_binary(byte_substr(`data`, 1, 80))

On Tue, Jul 17, 2018 at 3:12 PM, John Omernik <j...@omernik.com> wrote:

So on B, I found BYTE_SUBSTR and now send only 200 bytes to the
string_binary function, but I still get an error. Something else is happening
here...

select `type`, `timestamp`, `src_ip`, `dst_ip`, `src_port`, `dst_port`,
string_binary(byte_substr(`data`, 1, 200)) as mydata from
`user/jomernik/bf2_7306.pcap` limit 10

I get the same error:

Error Id: 213075e7-378a-437f-a5dc-408326f123f3 on
zeta3.brewingintel.com:20005]

org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
IndexOutOfBoundsException: index: 0, length: 379 (expected: range(0, 256))





On Tue, Jul 17, 2018 at 12:56 PM, John Omernik <j...@omernik.com> wrote:

Thanks Vlad, a couple of thoughts.


A. I think that should be fixed. That seems like a limitation that is
both unexpected and undocumented.

B. Is there a way, if my data in the table is returned as binary to
start with, for me to return the first 256 bytes? I tried substring, but it
tries to force the data to UTF-8, and I am getting some issues there.

On Tue, Jul 17, 2018 at 10:33 AM, Vlad Rozov <vro...@apache.org> wrote:

In the case of DRILL-6607, the issue lies in the implementation of the
"string_binary" function: it is not prepared to handle incoming data that,
when converted to a binary string, would exceed 256 bytes, because it does not
reallocate the output buffer. Until the function code is fixed, the only
way to avoid the error is either not to use "string_binary" or to use it
only with data that meets the "string_binary" limitation.
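
To illustrate the failure mode described above, here is a minimal Python sketch contrasting a fixed-size output buffer with one that grows as needed. It is not Drill's code: the function names are hypothetical, and the `\xXX` escape format for non-printable bytes is an assumption made for the example.

```python
def escape_byte(b: int) -> bytes:
    """Assumed escaping: printable ASCII as-is, anything else as \\xXX."""
    if 0x20 <= b <= 0x7E:
        return bytes([b])
    return f"\\x{b:02X}".encode("ascii")


def to_escaped_fixed(data: bytes, capacity: int = 256) -> str:
    """Writes into a fixed-size buffer and fails once it is full."""
    out = bytearray(capacity)
    pos = 0
    for b in data:
        chunk = escape_byte(b)
        if pos + len(chunk) > capacity:
            raise IndexError(
                f"escaped output needs more than {capacity} bytes"
            )
        out[pos:pos + len(chunk)] = chunk
        pos += len(chunk)
    return out[:pos].decode("ascii")


def to_escaped_growing(data: bytes) -> str:
    """Same escaping, but the buffer grows with the output."""
    out = bytearray()
    for b in data:
        out += escape_byte(b)
    return out.decode("ascii")
```

With these definitions, `to_escaped_fixed(b"\x00" * 65)` raises, while `to_escaped_growing` handles the same input because its buffer is not capped at 256 bytes.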

Thank you,

Vlad


On 7/13/18 14:01, Ted Dunning wrote:

There are bounds for acceptable behavior for a function like this. An array
index out of bounds is not acceptable. Aborting with a clean message about
the true problem might be fine, as would returning a null.

On Fri, Jul 13, 2018, 13:46 John Omernik <j...@omernik.com> wrote:

So, as to the actual problem, I opened a JIRA here:
https://issues.apache.org/jira/browse/DRILL-6607

The reason I brought this here is my own curiosity: does an issue in using
this function most likely lie in the function code itself not handling good
data, or in the pcap plugin, which produces the data for this function to
consume? I am just curious how something like this could be avoided.

John


