RE: Not Able To Access Drill Web Console In Multiple tabs Using Drill 1.13
Thanks for your response, Kunal.

I tried it both ways, but the outcome is the same: the window where I submitted the query keeps loading and the other one gets stuck. The same query works well with version 1.12, where I could monitor the query status and submit another query.

I am submitting my query and trying to observe its progress using the web console of the same Drillbit, as in our cluster we have only one web console that we can access.

Thanks for your suggestion regarding the other tools; that is one option I could try.

Please let me know your response.

Best regards,
Tilak

-Original Message-
From: Kunal Khatua [mailto:ku...@apache.org]
Sent: Tuesday, July 17, 2018 8:36 PM
To: user@drill.apache.org
Subject: Re: Not Able To Access Drill Web Console In Multiple tabs Using Drill 1.13

You could try the reverse: monitor in the initial window while submitting the query in another window.

That said, the reason your console is getting stuck is by design. The browser tab from which you submit the query is the window where you'll receive the results of the query. Hence, the window is "stuck" as it is waiting for results to come back.

As for why you are not able to monitor the current query status: you might have a fairly large result set that the server is formatting for the web console, which saturates the WebServer threads.

A simpler workaround is to monitor the system through a second Drillbit's web console. If the first Drillbit (from which you launched the query) is very busy, you'll see the status updates not coming in as frequently.

As a rule of thumb, use the WebConsole for quick exploration (i.e. experimental queries with LIMIT to just glance at the data). Otherwise, there are a number of good JDBC-based tools, like SQuirreL and DBeaver (the latter also downloads the drivers automatically), that you can use.

On 7/17/2018 3:40:06 AM, Surneni Tilak wrote:

Hi Team,

I am using Drill version 1.13.0. I am facing the issues below, which were not there in 1.12.0:

1. When I submit a query, I am not able to open the Drill web console in another window to monitor the status of the currently running query.
2. I am not able to submit another query while a query is running, as the console gets stuck running the first query.

Please guide me on how I could resolve these issues, as I would like to use the latest version of Drill.

Best regards,
Tilak
Re: Array Index Out of Bounds in String Binary
A. +1.

B. Every byte in binary data may require up to 4 bytes (0xXX) in the string representation, so 80 may work, 60 should reliably work.

Thank you,
Vlad

On 7/17/18 13:14, John Omernik wrote:

Yet this works?

string_binary(byte_substr(`data`, 1, 80))

On Tue, Jul 17, 2018 at 3:12 PM, John Omernik wrote:

So on B. I found BYTE_SUBSTR and only send 200 bytes to the string_binary function, but I still get an error. Something else is happening here...

select `type`, `timestamp`, `src_ip`, `dst_ip`, `src_port`, `dst_port`, string_binary(byte_substr(`data`, 1, 200)) as mydata from `user/jomernik/bf2_7306.pcap` limit 10

I get the same

Error Id: 213075e7-378a-437f-a5dc-408326f123f3 on zeta3.brewingintel.com:20005]

org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, length: 379 (expected: range(0, 256))

On Tue, Jul 17, 2018 at 12:56 PM, John Omernik wrote:

Thanks Vlad, a couple of thoughts.

A. I think that should be fixed. That seems like a limitation that is both unexpected and undocumented.

B. Is there a way, if my data in the table is returned as binary to start with, for me to return the first 256 bytes? I tried substring, and tried to force it to UTF-8, and I am getting some issues there.

On Tue, Jul 17, 2018 at 10:33 AM, Vlad Rozov wrote:

In the case of DRILL-6607, the issue lies in the implementation of the "string_binary" function: it is not prepared to handle incoming data that, when converted to a binary string, would exceed 256 bytes, as it does not reallocate the output buffer. Until the function code is fixed, the only way to avoid the error is either not to use "string_binary" or to use it only with data that meets the "string_binary" limitation.

Thank you,
Vlad

On 7/13/18 14:01, Ted Dunning wrote:

There are bounds for acceptable behavior for a function like this. Array index out of bounds is not acceptable. Aborting with a clean message about the true problem might be fine, as would returning a null.

On Fri, Jul 13, 2018, 13:46 John Omernik wrote:

So, as to the actual problem, I opened a JIRA here: https://issues.apache.org/jira/browse/DRILL-6607

The reason I brought this here is my own curiosity: does an issue in using this function most likely lie in the function code itself not handling good data, or is the issue in the pcap plugin, which produces the data for this function to consume? I am just curious how something like this could be avoided.

John
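Vlad's sizing estimate at the top of this reply (up to 4 output bytes per input byte, against the 256-byte limit shown in the error message) can be checked with a little arithmetic. The sketch below is a rough model, not Drill's actual code: the escaping rule (printable ASCII other than backslash stays 1 character, everything else becomes 4) and the names `escaped_length` and `max_safe_input` are assumptions made for illustration.

```python
# Rough model of the expansion Vlad describes; NOT Drill's implementation.
OUTPUT_LIMIT = 256  # taken from the error message: expected range(0, 256)


def escaped_length(data: bytes) -> int:
    """Length of the escaped text under the assumed rule:
    printable ASCII (except backslash) -> 1 character, anything else -> 4."""
    return sum(1 if 0x20 <= b <= 0x7E and b != 0x5C else 4 for b in data)


def max_safe_input(limit: int = OUTPUT_LIMIT, worst_case_width: int = 4) -> int:
    """Largest input size that still fits if every byte needs escaping."""
    return limit // worst_case_width


print(max_safe_input())              # 64
print(escaped_length(b"\x00" * 80))  # 320: 80 bytes overflow in the worst case
print(escaped_length(b"A" * 80))     # 80: 80 bytes fit when all are printable
```

Under these assumptions, 80 input bytes fit only when enough of them are printable, while anything up to 64 bytes always fits, which matches "80 may work, 60 should reliably work".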
Re: Array Index Out of Bounds in String Binary
Yet this works?

string_binary(byte_substr(`data`, 1, 80))

On Tue, Jul 17, 2018 at 3:12 PM, John Omernik wrote:

> So on B. I found BYTE_SUBSTR and only send 200 bytes to the string_binary function, but I still get an error. Something else is happening here...
>
> select `type`, `timestamp`, `src_ip`, `dst_ip`, `src_port`, `dst_port`, string_binary(byte_substr(`data`, 1, 200)) as mydata from `user/jomernik/bf2_7306.pcap` limit 10
>
> I get the same
>
> Error Id: 213075e7-378a-437f-a5dc-408326f123f3 on zeta3.brewingintel.com:20005]
>
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, length: 379 (expected: range(0, 256))
>
> On Tue, Jul 17, 2018 at 12:56 PM, John Omernik wrote:
>
>> Thanks Vlad, a couple of thoughts.
>>
>> A. I think that should be fixed. That seems like a limitation that is both unexpected and undocumented.
>>
>> B. Is there a way, if my data in the table is returned as binary to start with, for me to return the first 256 bytes? I tried substring, and tried to force it to UTF-8, and I am getting some issues there.
>>
>> On Tue, Jul 17, 2018 at 10:33 AM, Vlad Rozov wrote:
>>
>>> In the case of DRILL-6607, the issue lies in the implementation of the "string_binary" function: it is not prepared to handle incoming data that, when converted to a binary string, would exceed 256 bytes, as it does not reallocate the output buffer. Until the function code is fixed, the only way to avoid the error is either not to use "string_binary" or to use it only with data that meets the "string_binary" limitation.
>>>
>>> Thank you,
>>> Vlad
>>>
>>> On 7/13/18 14:01, Ted Dunning wrote:
>>>
>>>> There are bounds for acceptable behavior for a function like this. Array index out of bounds is not acceptable. Aborting with a clean message about the true problem might be fine, as would returning a null.
>>>>
>>>> On Fri, Jul 13, 2018, 13:46 John Omernik wrote:
>>>>
>>>>> So, as to the actual problem, I opened a JIRA here: https://issues.apache.org/jira/browse/DRILL-6607
>>>>>
>>>>> The reason I brought this here is my own curiosity: does an issue in using this function most likely lie in the function code itself not handling good data, or is the issue in the pcap plugin, which produces the data for this function to consume? I am just curious how something like this could be avoided.
>>>>>
>>>>> John
Re: Array Index Out of Bounds in String Binary
So on B. I found BYTE_SUBSTR and only send 200 bytes to the string_binary function, but I still get an error. Something else is happening here...

select `type`, `timestamp`, `src_ip`, `dst_ip`, `src_port`, `dst_port`, string_binary(byte_substr(`data`, 1, 200)) as mydata from `user/jomernik/bf2_7306.pcap` limit 10

I get the same

Error Id: 213075e7-378a-437f-a5dc-408326f123f3 on zeta3.brewingintel.com:20005]

org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, length: 379 (expected: range(0, 256))

On Tue, Jul 17, 2018 at 12:56 PM, John Omernik wrote:

> Thanks Vlad, a couple of thoughts.
>
> A. I think that should be fixed. That seems like a limitation that is both unexpected and undocumented.
>
> B. Is there a way, if my data in the table is returned as binary to start with, for me to return the first 256 bytes? I tried substring, and tried to force it to UTF-8, and I am getting some issues there.
>
> On Tue, Jul 17, 2018 at 10:33 AM, Vlad Rozov wrote:
>
>> In the case of DRILL-6607, the issue lies in the implementation of the "string_binary" function: it is not prepared to handle incoming data that, when converted to a binary string, would exceed 256 bytes, as it does not reallocate the output buffer. Until the function code is fixed, the only way to avoid the error is either not to use "string_binary" or to use it only with data that meets the "string_binary" limitation.
>>
>> Thank you,
>> Vlad
>>
>> On 7/13/18 14:01, Ted Dunning wrote:
>>
>>> There are bounds for acceptable behavior for a function like this. Array index out of bounds is not acceptable. Aborting with a clean message about the true problem might be fine, as would returning a null.
>>>
>>> On Fri, Jul 13, 2018, 13:46 John Omernik wrote:
>>>
>>>> So, as to the actual problem, I opened a JIRA here: https://issues.apache.org/jira/browse/DRILL-6607
>>>>
>>>> The reason I brought this here is my own curiosity: does an issue in using this function most likely lie in the function code itself not handling good data, or is the issue in the pcap plugin, which produces the data for this function to consume? I am just curious how something like this could be avoided.
>>>>
>>>> John
Re: Not Able To Access Drill Web Console In Multiple tabs Using Drill 1.13
You could try the reverse: monitor in the initial window while submitting the query in another window.

That said, the reason your console is getting stuck is by design. The browser tab from which you submit the query is the window where you'll receive the results of the query. Hence, the window is "stuck" as it is waiting for results to come back.

As for why you are not able to monitor the current query status: you might have a fairly large result set that the server is formatting for the web console, which saturates the WebServer threads.

A simpler workaround is to monitor the system through a second Drillbit's web console. If the first Drillbit (from which you launched the query) is very busy, you'll see the status updates not coming in as frequently.

As a rule of thumb, use the WebConsole for quick exploration (i.e. experimental queries with LIMIT to just glance at the data). Otherwise, there are a number of good JDBC-based tools, like SQuirreL and DBeaver (the latter also downloads the drivers automatically), that you can use.

On 7/17/2018 3:40:06 AM, Surneni Tilak wrote:

Hi Team,

I am using Drill version 1.13.0. I am facing the issues below, which were not there in 1.12.0:

1. When I submit a query, I am not able to open the Drill web console in another window to monitor the status of the currently running query.
2. I am not able to submit another query while a query is running, as the console gets stuck running the first query.

Please guide me on how I could resolve these issues, as I would like to use the latest version of Drill.

Best regards,
Tilak
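For readers who want to script the "monitor through a second Drillbit" workaround rather than keep a browser tab open, here is a minimal sketch that polls Drill's REST endpoint for query profiles. It is an illustration, not something proposed in the thread. The host name in the usage note is hypothetical, and the field names `runningQueries`, `queryId` and `state` reflect my understanding of the `/profiles.json` response, so verify them against your Drill version.

```python
import json
import urllib.request


def fetch_profiles(base_url: str) -> dict:
    """GET <base_url>/profiles.json and return the parsed JSON document."""
    with urllib.request.urlopen(f"{base_url}/profiles.json", timeout=10) as resp:
        return json.load(resp)


def running_queries(profiles: dict) -> list:
    """Return (queryId, state) pairs for the queries listed as running.

    The key names used here are assumptions about the response layout.
    """
    return [
        (q.get("queryId"), q.get("state"))
        for q in profiles.get("runningQueries", [])
    ]


# Offline example with a hand-written payload, so no Drillbit is needed.
sample = {
    "runningQueries": [{"queryId": "example-id", "state": "RUNNING"}],
    "finishedQueries": [],
}
print(running_queries(sample))
```

Against a live cluster this would be called as `running_queries(fetch_profiles("http://second-drillbit.example:8047"))`, where 8047 is Drill's default web port. If the target Drillbit's web server threads are saturated, as described above, this request would presumably stall in the same way the browser does.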
Re: Array Index Out of Bounds in String Binary
Thanks Vlad, a couple of thoughts.

A. I think that should be fixed. That seems like a limitation that is both unexpected and undocumented.

B. Is there a way, if my data in the table is returned as binary to start with, for me to return the first 256 bytes? I tried substring, and tried to force it to UTF-8, and I am getting some issues there.

On Tue, Jul 17, 2018 at 10:33 AM, Vlad Rozov wrote:

> In the case of DRILL-6607, the issue lies in the implementation of the "string_binary" function: it is not prepared to handle incoming data that, when converted to a binary string, would exceed 256 bytes, as it does not reallocate the output buffer. Until the function code is fixed, the only way to avoid the error is either not to use "string_binary" or to use it only with data that meets the "string_binary" limitation.
>
> Thank you,
> Vlad
>
> On 7/13/18 14:01, Ted Dunning wrote:
>
>> There are bounds for acceptable behavior for a function like this. Array index out of bounds is not acceptable. Aborting with a clean message about the true problem might be fine, as would returning a null.
>>
>> On Fri, Jul 13, 2018, 13:46 John Omernik wrote:
>>
>>> So, as to the actual problem, I opened a JIRA here: https://issues.apache.org/jira/browse/DRILL-6607
>>>
>>> The reason I brought this here is my own curiosity: does an issue in using this function most likely lie in the function code itself not handling good data, or is the issue in the pcap plugin, which produces the data for this function to consume? I am just curious how something like this could be avoided.
>>>
>>> John
Re: Array Index Out of Bounds in String Binary
In the case of DRILL-6607, the issue lies in the implementation of the "string_binary" function: it is not prepared to handle incoming data that, when converted to a binary string, would exceed 256 bytes, as it does not reallocate the output buffer. Until the function code is fixed, the only way to avoid the error is either not to use "string_binary" or to use it only with data that meets the "string_binary" limitation.

Thank you,
Vlad

On 7/13/18 14:01, Ted Dunning wrote:

There are bounds for acceptable behavior for a function like this. Array index out of bounds is not acceptable. Aborting with a clean message about the true problem might be fine, as would returning a null.

On Fri, Jul 13, 2018, 13:46 John Omernik wrote:

So, as to the actual problem, I opened a JIRA here: https://issues.apache.org/jira/browse/DRILL-6607

The reason I brought this here is my own curiosity: does an issue in using this function most likely lie in the function code itself not handling good data, or is the issue in the pcap plugin, which produces the data for this function to consume? I am just curious how something like this could be avoided.

John
Re: CT from parquet to CSV seems to not properly encode to UTF8
Hey guys,

Adding this JVM flag to the drill-env.sh file made it work.

export JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF8"

Thank you very much.

On Tue, Jul 17, 2018 at 1:49 AM, Kunal Khatua wrote:

> Hi Carlos
>
> It looks similar to an issue reported previously:
> https://lists.apache.org/thread.html/1f3d4c427690c06f1992bc5070f355689ccc5b1ed8cc3678ad8e9106@
>
> Could you try setting the JVM's file encoding to UTF-8 and retry? If it does not work, please file a JIRA in https://issues.apache.org
>
> Thanks
> Kunal
>
> On 7/16/2018 1:25:45 PM, Carlos Derich wrote:
> It seems to be an issue only with CSV/TSV files.
>
> Tried writing the output as JSON and it handles the encoding properly.
>
> alter session set `store.format`='json'
> create table dfs.tmp.test3 as select `city` from dfs.parquets.`file`
>
> Returns:
>
> {"city": "Montréal"}
>
> additional info:
>
> parquet-tools schema:
>
> message root {
>   optional binary city (UTF8);
> }
>
> On Mon, Jul 16, 2018 at 2:49 PM, Carlos Derich wrote:
>
> > Hello guys, hope everyone is well.
> >
> > I am having an encoding issue when converting a table from parquet into csv files, I wonder if someone could shed some light on it?
> >
> > One of my data sets has data in French with lots of accentuation, and it is persisted in HDFS as parquet.
> >
> > When I query the parquet table with the following, it properly returns the data encoded:
> >
> > select `city` from dfs.parquets.`file`
> >
> > city
> >
> > Montréal
> >
> > Then I convert this table into a CSV file with the following query:
> >
> > alter session set `store.format`='csv'
> > create table dfs.csvs.`converted` as select * from dfs.parquets.`file`
> >
> > Then when I run a select query on it, it returns data not properly encoded:
> >
> > select columns[0] from dfs.csvs.`converted`
> >
> > Returns:
> >
> > Montr?al
> >
> > My storage plugin is pretty standard:
> >
> > "csv" : {
> >   "type" : "text",
> >   "extensions" : [ "csv" ],
> >   "delimiter" : ",",
> >   "skipFirstLine": true
> > },
> >
> > Should I explicitly add a charset option somewhere? Couldn't find anything helpful in the docs.
> >
> > Tried adding export DRILL_JAVA_OPTS="$DRILL_JAVA_OPTS -Dsaffron.default.charset=UTF-8" to the drill-env.sh file, but no luck.
> >
> > Has anyone run into similar issues?
> >
> > Thank you!
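The `Montr?al` symptom in this thread is what one would expect if the text were written through an ASCII-only encoder: a character with no ASCII representation is replaced by `?`. The sketch below reproduces that effect in Python purely as an illustration. It does not touch Drill, and the assumption that the JVM's default charset was ASCII-like before the `-Dfile.encoding=UTF8` flag was set is mine, not something confirmed in the thread.

```python
city = "Montr\u00e9al"  # "Montréal", with a precomposed e-acute

# An ASCII-only encoder cannot represent the accented character,
# so with errors="replace" it substitutes "?".
as_ascii = city.encode("ascii", errors="replace")

# UTF-8 keeps the character, spending two bytes on it.
as_utf8 = city.encode("utf-8")

print(as_ascii.decode("ascii"))  # Montr?al
print(as_utf8.decode("utf-8"))   # Montréal
print(len(as_utf8))              # 9
```

This would also explain why the JSON output looked fine while the CSV output did not, if the two writers pick their character encoding differently, though that too is an assumption.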
Not Able To Access Drill Web Console In Multiple tabs Using Drill 1.13
Hi Team,

I am using Drill version 1.13.0. I am facing the issues below, which were not there in 1.12.0:

1. When I submit a query, I am not able to open the Drill web console in another window to monitor the status of the currently running query.
2. I am not able to submit another query while a query is running, as the console gets stuck running the first query.

Please guide me on how I could resolve these issues, as I would like to use the latest version of Drill.

Best regards,
Tilak