Yet this works?

string_binary(byte_substr(`data`, 1, 80))
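
A possible explanation, sketched under assumptions rather than taken from the Drill source: if string_binary writes printable ASCII bytes as-is and expands every other byte to a four-character \xNN escape, then the output length depends on the content, not just the input length. The 256 limit below comes from the error message in this thread. The printable range and all function names are hypothetical, for illustration only.

```python
# Hypothetical model of string_binary-style escaping; NOT Drill's actual code.
# Assumption: printable ASCII (except backslash) costs 1 output char,
# every other byte is written as a 4-char escape such as \x0A.
PRINTABLE = set(range(0x20, 0x7F)) - {ord("\\")}

# Output buffer size, taken from "expected: range(0, 256)" in the error.
OUTPUT_LIMIT = 256


def escaped_length(data: bytes) -> int:
    """Return the length of `data` after escaping under the assumed rule."""
    return sum(1 if b in PRINTABLE else 4 for b in data)


def fits(data: bytes) -> bool:
    """True if the escaped form of `data` fits in the fixed output buffer."""
    return escaped_length(data) <= OUTPUT_LIMIT


def max_safe_input_length() -> int:
    """Largest input length that fits even if every byte needs escaping."""
    return OUTPUT_LIMIT // 4


if __name__ == "__main__":
    mostly_text = b"GET /index.html HTTP/1.1\r\n" * 3   # 78 bytes, few escapes
    all_binary = bytes(80)                               # 80 zero bytes
    print(len(mostly_text), escaped_length(mostly_text), fits(mostly_text))
    print(len(all_binary), escaped_length(all_binary), fits(all_binary))
    print("always-safe input length:", max_safe_input_length())
```

If that assumption holds, an 80-byte slice only works when most of the bytes are printable, and byte_substr(`data`, 1, 64) would be the largest slice that fits regardless of content.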

On Tue, Jul 17, 2018 at 3:12 PM, John Omernik wrote:

> So on B. I found the BYTE_SUBSTR and only send 200 bytes to the
> string_binary function, and I still get an error. Something else is
> happening here...
>
> select `type`, `timestamp`, `src_ip`, `dst_ip`, `src_port`, `dst_port`,
> string_binary(byte_substr(`data`, 1, 200)) as mydata from
> `user/jomernik/bf2_7306.pcap` limit 10
>
> I get the same
>
> Error Id: 213075e7-378a-437f-a5dc-408326f123f3 on
> zeta3.brewingintel.com:20005]
>
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
> IndexOutOfBoundsException: index: 0, length: 379 (expected: range(0, 256))
>
> On Tue, Jul 17, 2018 at 12:56 PM, John Omernik wrote:
>
>> Thanks Vlad, a couple of thoughts.
>>
>> A. I think that should be fixed. That seems like a limitation that is
>> both unexpected and undocumented.
>>
>> B. Is there a way, if my data in the table is returned as binary to
>> start with, for me to return the first 256 bytes? I tried substring, and
>> tried to force to UTF-8, and I am getting some issues there.
>>
>> On Tue, Jul 17, 2018 at 10:33 AM, Vlad Rozov wrote:
>>
>>> In case of DRILL-6607 the issue lies in the implementation of the
>>> "string_binary" function: it is not prepared to handle incoming data that
>>> when converted to a binary string would exceed 256 bytes, as it does not
>>> reallocate the output buffer. Until the function code is fixed, the only
>>> way to avoid the error is either not to use "string_binary" or to use it
>>> with data that meets the "string_binary" limitation.
>>>
>>> Thank you,
>>>
>>> Vlad
>>>
>>> On 7/13/18 14:01, Ted Dunning wrote:
>>>
>>>> There are bounds for acceptable behavior for a function like this.
>>>> Array index out of bounds is not acceptable. Aborting with a clean
>>>> message about the true problem might be fine, as would be to return
>>>> a null.
>>>>
>>>> On Fri, Jul 13, 2018, 13:46 John Omernik wrote:
>>>>
>>>>> So, as to the actual problem, I opened a JIRA here:
>>>>>
>>>>> https://issues.apache.org/jira/browse/DRILL-6607
>>>>>
>>>>> The reason I brought this here is my own curiosity: Does an issue in
>>>>> using this function most likely lie in the function code itself not
>>>>> handling good data, or is the issue in the pcap plugin which produces
>>>>> the data for this function to consume? I am just curious how
>>>>> something like this could be avoided.
>>>>>
>>>>> John
