Hello - and collect_set UDF?

2020-06-08 Thread John Omernik
) Any UDFs that will achieve this? Even if you've seen them on GitHub etc. (i.e. not built into Drill). Thanks! John Omernik

Documentation Issue

2019-01-03 Thread John Omernik
Hey all, I was looking for some tools to help me add/subtract dates, and I went to: https://drill.apache.org/docs/date-time-functions-and-arithmetic/ TIMESTAMPDIFF and TIMESTAMPADD were there, and I got excited, I organized my query, and tried to run them and got the error below. So I tried

Nested Window Queries

2019-01-03 Thread John Omernik
Is there a limitation on nesting of Window Queries? I have a query where I am using an event stream, and the changing of a value to indicate an event. (The state goes from disconnected, to charging, to complete, it reports many times in each of those states, but I am using lag(state, 1) over
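The lag-based change detection described in this message can be sketched outside of SQL. This is a minimal Python illustration only, not the poster's query: the function name and the sample timestamps are hypothetical, and the three state names are the ones mentioned above. Each row is compared with the row before it, which is the role lag(state, 1) plays in a window query.

```python
from typing import Iterable, List, Optional, Tuple


def state_changes(
    readings: Iterable[Tuple[str, str]]
) -> List[Tuple[str, Optional[str], str]]:
    """Return (timestamp, previous_state, new_state) for every row whose
    state differs from the row before it. Rows must already be ordered."""
    changes = []
    previous: Optional[str] = None  # stands in for lag(state, 1); None on the first row
    for ts, state in readings:
        if state != previous:
            changes.append((ts, previous, state))
        previous = state
    return changes


# Hypothetical sample stream: many reports per state, few actual events.
readings = [
    ("10:00", "disconnected"),
    ("10:01", "disconnected"),
    ("10:02", "charging"),
    ("10:03", "charging"),
    ("10:04", "complete"),
]
events = state_changes(readings)
```

Note that this sketch reports the first row as an event because it has no predecessor; in SQL, comparing against a NULL lag value would not evaluate to true, so a query would need to handle that case explicitly.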

Re: Time for a fun Query Question

2018-12-04 Thread John Omernik
out of a time field in string format. Still open to other ideas :) On Tue, Dec 4, 2018 at 10:05 AM John Omernik wrote: > Time for a fun question: How to be clever with queries! > > > I have a table that takes readings from an IoT type device > > > opt_id dt

Time for a fun Query Question

2018-12-04 Thread John Omernik
Time for a fun question: How to be clever with queries! I have a table that takes readings from an IoT type device opt_id dt ts eday opt_string 2.1.1 2018-12-01 10:43:43 12.5 1 2.1.2 2018-12-01 10:32:43 5.51 2.1.3

Re: [DISCUSS] Add Hadoop Credentials API support for Drill S3 storage plugin

2018-09-11 Thread John Omernik
I think in general Drill should consider the issue of credentials across storage plugins. Credentials exist for S3, but also Mongo, JDBC, and others as they get added. This can be a pain to manage and leads to insecurely setup Drill clusters. One option may be to allow a generic integration with

Re: Session handling with multiple drillbits

2018-09-05 Thread John Omernik
on was retained in Zookeeper, > but we weren't so sure about this. How is session information tracked > and maintained across multiple drillbits? > > Thanks so much for taking the time to engage on this! > > > > John Omernik wrote on 2018-09-05 9:40 AM: > > Rereading your pos

Re: Session handling with multiple drillbits

2018-09-05 Thread John Omernik
multiple drillbits? > > Thanks so much for taking the time to engage on this! > > > > > John Omernik wrote on 2018-09-05 9:40 AM: > >> Rereading your post, I think there is some concern between embedded/single >> drillbit mode, and distributed mode. >

Re: Session handling with multiple drillbits

2018-09-05 Thread John Omernik
with a different store.format) and set it to csv. I think the key here is authenticated distributed mode in Drill and that will be how you will do what you need to do. John On Tue, Sep 4, 2018 at 7:30 PM, John Omernik wrote: > The session is the users session, not the drill bit. Since you ar

Re: Session handling with multiple drillbits

2018-09-04 Thread John Omernik
Are these ETL ish type queries? store.format should only apply when Drill is writing data, when it is reading, it uses the filenames and other hints to read. Thus, if you do HA, say with DNS (like in the other thread) and prior to running your CREATE TABLE AS (I am assuming this is what you

Re: Apache Drill High Availability using HAproxy

2018-08-27 Thread John Omernik
This is a great topic, that I have run into running Drill on Apache Mesos due to each of my bits having essentially a DNS load balancer. (One DNS Name, multiple Drill bits IPs assigned to them). That said, I've run into a few issues and have a few workarounds. Note, I am talking about the REST

Functions in Drill but not in documentation

2018-08-02 Thread John Omernik
Hey all, I was going to create a JIRA for this, but realized it's a symptom rather than a problem. I had a friend ask me "Hey, how do I do a regex search in Drill?" I thought, well, look in the docs. And I go to string functions and it's not there...(Doc page last updated Jan 14, 2016) Then I

Re: Apache Drill on Kubernetes

2018-07-26 Thread John Omernik
Absolutely it can! I don't have a docker file for it handy but I know it works well! On Thu, Jul 26, 2018, 10:45 AM Arjun Rao wrote: > Hi, > I am new to this forum and am excited to use Drill. This might have been > discussed in the past but I wanted to know if Apache Drill can be run on >

Re: Array Index Out of Bounds in String Binary

2018-07-18 Thread John Omernik
XX) in the > string representation, so 80 may work, 60 should reliably work. > > Thank you, > > Vlad > > > On 7/17/18 13:14, John Omernik wrote: > >> Yet this works? >> >> string_binary(byte_substr(`data`, 1, 80)) >> >> On Tue, Jul 17,

Re: Array Index Out of Bounds in String Binary

2018-07-17 Thread John Omernik
Yet this works? string_binary(byte_substr(`data`, 1, 80)) On Tue, Jul 17, 2018 at 3:12 PM, John Omernik wrote: > So on B. I found the BYTE_SUBSTR and only send 200 bytes to the > string_binary function, I still get an error. Something else is happening > here... > >

Re: Array Index Out of Bounds in String Binary

2018-07-17 Thread John Omernik
t of bounds is not acceptable. Aborting with a clean message >>> about >>> the true problem might be fine, as would be to return a null. >>> >>> On Fri, Jul 13, 2018, 13:46 John Omernik wrote: >>> >>> So, as to the actual problem, I opened

Re: Array Index Out of Bounds in String Binary

2018-07-17 Thread John Omernik
ou, > > Vlad > > > On 7/13/18 14:01, Ted Dunning wrote: > >> There are bounds for acceptable behavior for a function like this. Array >> index out of bounds is not acceptable. Aborting with a clean message about >> the true problem might be fine, as would be to return a null. >

Array Index Out of Bounds in String Binary

2018-07-13 Thread John Omernik
So, as to the actual problem, I opened a JIRA here: https://issues.apache.org/jira/browse/DRILL-6607 The reason I brought this here is my own curiosity: Does an issue in using this function most likely lie in the function code itself not handling good data, or is the issue in the pcap plugin

Re: Way to "pivot"

2018-03-06 Thread John Omernik
a JIRA so that we can track it? > > > > On Tue, Mar 6, 2018 at 11:40 AM, John Omernik <j...@omernik.com> wrote: > > > > > Perfect. That works for me because I have a limited number of values, > I > > > could see that getting out of hand if the values w

Re: Way to "pivot"

2018-03-06 Thread John Omernik
arch 6, 2018 9:11 PM > > To: user@drill.apache.org > > Subject: Re: Way to "pivot" > > > > If the X, Y and Z is unique for each timestamp you can perhaps use group > > by (dt, X, Y , Z) and case to make the X, Y , Z columns. May be worth > > looki

Way to "pivot"

2018-03-06 Thread John Omernik
I am not sure if this is the right thing for what I am trying to do, but I have data in this format: source dt value X 2018-03-06 11:00 0.31 X 2018-03-06 12:00 0.94 X 2018-03-06 13:00 0.89 X
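To make the requested "pivot" concrete, here is a minimal Python sketch of turning long rows (source, dt, value) into one row per dt with a column per source. It is an illustration under assumptions, not a Drill feature: the function name is hypothetical, the X rows are the ones shown in the message, and the Y rows are invented purely so there is a second column to pivot.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple, Union

Row = Dict[str, Union[str, Optional[float]]]


def pivot(rows: Iterable[Tuple[str, str, float]], sources: List[str]) -> List[Row]:
    """Group values by dt and spread each source into its own column.
    A source with no reading for a given dt is filled with None."""
    by_dt: Dict[str, Dict[str, float]] = defaultdict(dict)
    for source, dt, value in rows:
        by_dt[dt][source] = value
    return [
        {"dt": dt, **{s: by_dt[dt].get(s) for s in sources}}
        for dt in sorted(by_dt)
    ]


rows = [
    ("X", "2018-03-06 11:00", 0.31),  # from the message
    ("X", "2018-03-06 12:00", 0.94),  # from the message
    ("X", "2018-03-06 13:00", 0.89),  # from the message
    ("Y", "2018-03-06 11:00", 0.5),   # hypothetical value, for illustration only
    ("Y", "2018-03-06 12:00", 0.25),  # hypothetical value, for illustration only
]
wide = pivot(rows, ["X", "Y"])
```

The list of sources is passed in explicitly, which mirrors the limitation raised in the replies to this thread: the approach is manageable when the set of values is small and known in advance.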

Re: Looking for user feedback on DRILL-5741

2018-02-18 Thread John Omernik
I like this idea in general. When running under orchestrators like Yarn, Marathon, or Kubernetes, it's true that those things that start drill "manage" memory, however, there exist issues in that you need to set up the variables in drill to not exceed the amount that orchestrators have

Re: Documentation Update for Drill-4286

2018-02-12 Thread John Omernik
> > -Original Message- > From: John Omernik [mailto:j...@omernik.com] > Sent: Monday, February 12, 2018 12:44 PM > To: user <user@drill.apache.org> > Subject: Documentation Update for Drill-4286 > > I see in 1.12 this feature was added ("Have an ability to put s

Documentation Update for Drill-4286

2018-02-12 Thread John Omernik
I see in 1.12 this feature was added ("Have an ability to put server in quiescent mode of operation") https://issues.apache.org/jira/browse/DRILL-4286 Was the documentation updated? I couldn't find it. Perhaps we need a new section on the Drill Docs page "Administrate Drill" for this and other

Re: MapR Drill 1.12 Mismatch between Native and Library Versions

2018-02-09 Thread John Omernik
drill-1.10.0. > > > Thanks, > Sorabh > > > From: John Omernik <j...@omernik.com> > Sent: Friday, February 9, 2018 6:40:02 AM > To: user > Subject: Re: MapR Drill 1.12 Mismatch between Native and Library Versions

Re: MapR Drill 1.12 Mismatch between Native and Library Versions

2018-02-09 Thread John Omernik
apSizeLimit: 0 using: [plain] 2018-02-09 08:34:05,896 [main] INFO o.apache.drill.exec.server.Drillbit - Construction completed (3461 ms). On Fri, Feb 9, 2018 at 8:10 AM, John Omernik <j...@omernik.com> wrote: > So already, you have given me some things to work with. Knowing that there

Re: MapR Drill 1.12 Mismatch between Native and Library Versions

2018-02-09 Thread John Omernik
ting to the > > libraries available in /opt/mapr/lib/ > > > > I don't know the exact details of what all gets symlinked, but this step > > should have ensured that you don't see mismatch between the versions. > > > > That said... Support would be better equipp

MapR Drill 1.12 Mismatch between Native and Library Versions

2018-02-08 Thread John Omernik
I am running MapR's 1.12 drill on a node that only has posix client installed (and thus has a MapR client from that). I've recently had to work with MapR Support to get a fix to posix, and that fixed one issue, but now when I try to start a drill bit, I get this error. The fix was a small patch

Using a plugin on files with wrong extensions

2018-01-04 Thread John Omernik
Hello all - I was looking at using the PCAP plugin here, and I setup the type for pcap in the storage plugin. I went to go query data, and realized that my TB or so of packet captures from my malware lab were all recorded with Daemonlogger, and the extension on each file is the 10 digit unix

Re: Proposed Slack Channel for Drill Users & Devs

2018-01-04 Thread John Omernik
Yes Please! On Thu, Jan 4, 2018 at 11:36 AM, Robert Wu wrote: > Hi, > > I think someone created one a while back (under "drillers.slack.com"). > > Best regards, > > Rob > > -Original Message- > From: Charles Givre [mailto:cgi...@gmail.com] > Sent: Thursday, January

Timeframe on Apache Drill 1.12 in MapR Package?

2018-01-03 Thread John Omernik
Hey all, just wondering if MapR has a timeframe on getting 1.12 into the MEP? Thanks! John

Re: dotnet integration

2017-10-11 Thread John Omernik
Drill is a great tool, I am glad you are looking into it! I would say it highly depends on what you will be returning in individual queries. If your result sets are going to be huge (mine rarely are) then using the MapR ODBC will likely be your best bet, and posting on the MapR Community Forums

Re: Error Messages that are difficult to parse.

2017-09-26 Thread John Omernik
his is a simple but important bug, since it seems we don't know > which column is triggering the Exception. In case there isn't an existing > bug, could you file one for this? > > > > -Original Message- > From: John Omernik [mailto:j...@omernik.com] > Sent: Monda

Re: Error Messages that are difficult to parse.

2017-09-25 Thread John Omernik
0 Error Text: SYSTEM ERROR: NumberFormatException: On Mon, Sep 25, 2017 at 1:40 PM, John Omernik <j...@omernik.com> wrote: > So I think I addressed the first one with > > select CASE when tbl.field.subfield[0] is null then '' else > tbl.field.subfield[0] end as myfield from table

Re: Error Messages that are difficult to parse.

2017-09-25 Thread John Omernik
: NumberFormatException: Which I am confused by because '' or the field isn't a number.. so not sure how to troubleshoot this one either.. John On Mon, Sep 25, 2017 at 1:14 PM, John Omernik <j...@omernik.com> wrote: > > So as a user, I got this > > > Error Returned - Code: 500 >

Error Messages that are difficult to parse.

2017-09-25 Thread John Omernik
So as a user, I got this Error Returned - Code: 500 Error Text: SYSTEM ERROR: IllegalArgumentException: You tried to read a [RepeatedInt] type when you are using a field reader of type [NullableIntReaderImpl]. It's a JSON dataset, the record exists in some row, and not in others, but I have no

Re: Workaround for drill queries during node failure

2017-09-13 Thread John Omernik
As long as the nodes are "up" during the planning phase they will be included in the planning. If they go "down" after planning, (i.e. during execution) and fragments are requested, they will not report, and will fail the query. So if you start off with 5 nodes, but node 4 is down for patches,

Re: Does Drill Use Apache Struts

2017-09-08 Thread John Omernik
Also, thank you for the pointer to the pom.xml On Fri, Sep 8, 2017 at 9:41 AM, John Omernik <j...@omernik.com> wrote: > So, I thought I was clear that it was unverified, but I also I am in cyber > security research, and this is what is being discussed in closed circles. I >

Re: Does Drill Use Apache Struts

2017-09-08 Thread John Omernik
exploit (three of 'em, actually). > > I do this for a living (cybersecurity research). > > Drill is not impacted which can be verified by looking at dependencies > in https://github.com/apache/drill/blob/master/pom.xml > > On Fri, Sep 8, 2017 at 10:12 AM, John Omernik

Re: Does Drill Use Apache Struts

2017-09-08 Thread John Omernik
> Almost certainly not. > > What issues are you referring to? I don't follow struts. > > > On Sep 8, 2017 16:00, "John Omernik" <j...@omernik.com> wrote: > > Hey all, given the recent issues related to Struts, can we confirm that > Drill doesn't use this Ap

Does Drill Use Apache Struts

2017-09-08 Thread John Omernik
Hey all, given the recent issues related to Struts, can we confirm that Drill doesn't use this Apache component for anything? I am not good enough at code reviews to see what may be used. John

Re: Performance on Unions - Some ideas please!

2017-08-23 Thread John Omernik
.. however, it would be harder to delineate when one day started and another ended in the avro data... 2. I would have to orchestrate both the CTAS and the View Update outside of Drill, not a huge pain, but I like self contained setups :) John On Wed, Aug 23, 2017 at 7:48 AM, John Omernik <j..

Performance on Unions - Some ideas please!

2017-08-23 Thread John Omernik
I have a streaming process that is writing to an avro table, with Schema etc. It's coming from BroIDS Connection logs, so my table name is like this: broconnavro/YYYY-MM-DD Basically I take any data that has come in on that day and put it into a dated folder in Avro format. 1. Avro support is

Re: Avro - Let's talk Avro again

2017-08-22 Thread John Omernik
r reporting is terrible > > >- Schema change reporting is almost absent > > >- Avro schema is fixed/strict even though text formats support > > > evolving/variable schema (With all sorts of side effects) > > >- Avro still does not support dirN > >

Re: Merge and save parquet files in Drill

2017-08-17 Thread John Omernik
Also, what is the cardinality of the partition field? If you have lots of partitions, you will have lots of files... On Thu, Aug 17, 2017 at 9:55 AM, Andries Engelbrecht wrote: > Do you partition the table? > You may want to sort (order by) on the columns you partition,

Re: Avro - Let's talk Avro again

2017-08-17 Thread John Omernik
at! > > Regards, > -Stefán > > On Thu, Aug 17, 2017 at 12:37 PM, John Omernik <j...@omernik.com> wrote: > > > I know Avro is the unwanted child of the Drill world. (I know others have > > tried to mature the Avro support and that has been something that still > i

Avro - Let's talk Avro again

2017-08-17 Thread John Omernik
I know Avro is the unwanted child of the Drill world. (I know others have tried to mature the Avro support and that has been something that still is in an "experimental" state.) That said, isn't it time for us to clean it up? I am sure there are some open JIRAs out there, (last Doc update on

Re: Querying Data with period in name

2017-08-11 Thread John Omernik
s://issues.apache.org/jira/browse/DRILL-4264> is not > fixed, you can try to do *select `**id.orig_h`*. It should not throw the > error. > > Kind regards, > Volodymyr Vysotskyi > > 2017-08-11 21:07 GMT+03:00 John Omernik <j...@omernik.com>: > > > Hey all, >

Querying Data with period in name

2017-08-11 Thread John Omernik
Hey all, I am querying some json and parquet data that has dots in the name. Not all the data I may be querying will come from Drill, thus dot is a valid character... when I go to initially explore my data, Drill throws the error below when I run a select * query. I understand the error, and I

Elastic Search Plugins

2017-07-28 Thread John Omernik
Is there any work being done on an Elastic Search plugin? That would be a huge benefit to the community! I see there are some older JIRAs... anything else? https://issues.apache.org/jira/browse/DRILL-3637 https://issues.apache.org/jira/browse/DRILL-3790

Re: How much more information can error messages provide?

2017-07-28 Thread John Omernik
es more friendly moving forward with any new features that ship. > > > > Please share the JIRA that you create. But a holistic approach scares me. > > How would we prioritize the ones that would impact most users? Any > thoughts > > on that. > > > > Saurabh >

Re: How much more information can error messages provide?

2017-07-27 Thread John Omernik
messages! John On Thu, Jul 27, 2017 at 8:25 AM, John Omernik <j...@omernik.com> wrote: > I want to bump this up. I've had a number of troubleshooting times where > getting more concise error messages would really help me deal with my data. > It looks like Dan found verbose mode, but so

Re: How much more information can error messages provide?

2017-07-27 Thread John Omernik
I want to bump this up. I've had a number of troubleshooting times where getting more concise error messages would really help me deal with my data. It looks like Dan found verbose mode, but sometimes verbose isn't what we need, but concise. Hey Dan, maybe we could come up with a Jira that is

Re: Rest API - Need to Improve

2017-07-12 Thread John Omernik
ch (mucked about with it a year > ago, but have gotten rusty since.) There is an easy way to indicate the > HTTP status; I just can’t remember what it is… > > Anyone else remember how to set the return status in a Jetty response? > > Thanks, > > - Paul > > > > On Jul 7,

Rest API - Need to Improve

2017-07-07 Thread John Omernik
Hello all, I recently setup some notebooks using the Rest API. I found that I was using Drill 1.8, and my code for determining authentication in Python, while hacky, worked... What I found is using python requests, when I posted to j_security check, the requests object almost always returned a
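As a point of reference for the login step mentioned in this message, the following is a minimal Python sketch that only builds the form-login POST without sending it. It uses the standard library rather than the requests package the poster used. The host name and credentials are hypothetical, 8047 is Drill's default web port, and the `j_username` / `j_password` field names are the conventional servlet form-login fields, assumed here to be what the `j_security_check` endpoint expects. How to judge success from the response is the open question of the thread, so the sketch deliberately stops before that.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(base_url: str, username: str, password: str) -> Request:
    """Build (but do not send) a form-encoded POST to j_security_check."""
    body = urlencode({"j_username": username, "j_password": password}).encode("ascii")
    return Request(
        base_url.rstrip("/") + "/j_security_check",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical host and credentials, for illustration only.
req = build_login_request("https://drill.example.com:8047/", "analyst", "secret")
```

Sending the request (for example with `urllib.request.urlopen`) and keeping the returned session cookie for later query calls would be the next step; that part depends on the cluster setup and is left out.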

Re: Drill Session ID between Nodes

2017-06-23 Thread John Omernik
-718-0098 > MapR Technologies > http://www.mapr.com > > > > On Jun 23, 2017, at 11:33 AM, John Omernik <j...@omernik.com<mailto:john@ > omernik.com>> wrote: > > So a few things > > 1. The issue is that as is, SSL stuff works fine, but when the IP a

Re: Drill Session ID between Nodes

2017-06-23 Thread John Omernik
, this is how MapR's maprlogin works with HTTPS even though we use > IP address by default. > > Keys > ___ > Keys Botzum > Distinguished Engineer, Field Engineering > kbot...@maprtech.com<mailto:kbot...@maprtech.com> > 443-718-0098

Re: Drill Session ID between Nodes

2017-06-23 Thread John Omernik
> /drill2.mydrill.corp.com>, drill3.mydrill.corp.com<http:/ > /drill3.mydrill.corp.com>, drill4.mydrill.corp.com<http:/ > /drill4.mydrill.corp.com>, > this is bad with wildcards: drill1, drill2, drill3, drill4 > > > Keys > ___ > Keys Botzum >

Re: Drill Session ID between Nodes

2017-06-22 Thread John Omernik
r@drill.apache.org > Subject: Re: Drill Session ID between Nodes > > Hi John, > > I do not believe that session IDs are global. Each Drillbit maintains its > own concept of sessions. A global session would require some centralized > registry of sessions, which Drill does not hav

Drill Session ID between Nodes

2017-06-22 Thread John Omernik
When I log onto a drill node, and get Session Id, if I connect to another drill node in the cluster will the session id be valid? I am guessing not, but want to validate. My conundrum, I have my Drill cluster running in such a way that the connections to the nodes are load balanced via DNS.

Re: Using Apache Drill with AirBnB SuperSet

2017-06-14 Thread John Omernik
t project looks pretty neat. > > Didn't realize that there is also a Python Driver for Drill. I'd think > that would be useful too. > > -Original Message----- > From: John Omernik [mailto:j...@omernik.com] > Sent: Wednesday, June 14, 2017 3:45 AM > To: user <user@drill

Using Apache Drill with AirBnB SuperSet

2017-06-14 Thread John Omernik
Hey all, I've had some success getting Drill working with Superset (Visualization tool) via work done by a few people (not just me) I thought I'd share what I was using and how it was working and see if others benefited from it, or if issues occur they can be posted to the issue to improve things.

Re: Parquet, Arrow, and Drill Roadmap

2017-05-04 Thread John Omernik
I've created a JIRA on this request. The idea here being some higher level descriptions of these projects (I included Calcite in the JIRA too), what they do for the project, what the current state of integration is, what options we have for future states, and what benefits those future states

Re: Discussion: Comments in Drill Views

2017-05-02 Thread John Omernik
I created a JIRA for this based on the Hangout today! https://issues.apache.org/jira/browse/DRILL-5461 On Mon, Mar 6, 2017 at 7:55 AM, John Omernik <j...@omernik.com> wrote: > I can see both sides. But Ted is right, this won't hurt any thing from a > performance perspective, even

Re: Clarification on Drill Options

2017-05-02 Thread John Omernik
Looks like some work has been done here, any chance we can move this along? https://issues.apache.org/jira/browse/DRILL-4699 Thanks! On Tue, May 31, 2016 at 12:51 PM, John Omernik <j...@omernik.com> wrote: > I added a JIRA related to this: > > https://issues.apache.org/jira/br

Re: Dealing with bad data when trying to do date computations

2017-05-02 Thread John Omernik
I just want to say, there is a great JIRA already opened here: https://issues.apache.org/jira/browse/DRILL-4258 I added a comment, I would encourage others to add comments if they think this idea would be beneficial. On Wed, Mar 1, 2017 at 8:50 AM, John Omernik <j...@omernik.com> wrote:

Parquet, Arrow, and Drill Roadmap

2017-05-01 Thread John Omernik
Hey all - I posted this to both dev and user as I could mentally make the argument for both, Sorry if this is answered somewhere already. I know in the past, there have been discussions around using two different readers for Parquet, and performance gains/losses, issues. etc. Right now, the

Re: NPE When Selecting from MapR-DB Table

2017-04-12 Thread John Omernik
> fragments on all nodes. > One thing you can try is lower the slice target so we create fragments on > all nodes. > Depending upon your configuration, making it close to average rowCount per > region might > be the ideal thing to do. > > Thanks, > Padma > > > On Apr 6

Re: NPE When Selecting from MapR-DB Table

2017-04-06 Thread John Omernik
By the way, are there any ways to manually prod the data to make it so the queries work again? It seems like an off by one type issue, can I add something to my data make it work? On Thu, Apr 6, 2017 at 2:03 PM, John Omernik <j...@omernik.com> wrote: > Oh nice, 1.10 from MapR has the f

Re: NPE When Selecting from MapR-DB Table

2017-04-06 Thread John Omernik
ix for DRILL-5395) > should be out shortly, within the next week or so. > > On Thu, Apr 6, 2017 at 11:50 AM, John Omernik <j...@omernik.com> wrote: > > > Nope no other issues. I was waiting on 1.10 to be available from MapR, do > > we know the release date for 1.

Re: NPE When Selecting from MapR-DB Table

2017-04-06 Thread John Omernik
with MapR-DB tables (when built with mapr profile). Let > us know if you are having any trouble. > > The fix for DRILL-5395 should be available this week, afaik. You could also > build off Padma's branch if you need it urgently. > > On Thu, Apr 6, 2017 at 11:25 AM, John Om

Re: NPE When Selecting from MapR-DB Table

2017-04-06 Thread John Omernik
gir...@apache.org> wrote: > Could be related to DRILL-5395 > <https://issues.apache.org/jira/browse/DRILL-5395>. Once committed, the > fix > should be available in Apache master. > > On Thu, Apr 6, 2017 at 10:56 AM, John Omernik <j...@omernik.com> wrote: > > >

NPE When Selecting from MapR-DB Table

2017-04-06 Thread John Omernik
Hello all, I am using Drill 1.8 and MapR 5.2. I just finished a large load of data into a mapr table. I was able to confirm that the table returns data from the c api for hbase, so no issue there, however, when I select from the table in Drill, either from the table directly, or from a view I

Re: Rest API Authentication (1.5 Feature)

2017-04-04 Thread John Omernik
single request similar to what Venki suggested. It would then be stateless (if you have to do a login request prior to your query request, it's not really a stateless call) and work as people would expect a rest request to work. Thoughts? John On Wed, Dec 23, 2015 at 12:26 PM, John Omer

Re: Drill Parquet Partitioning Method

2017-04-04 Thread John Omernik
essly work with other ecosystems. > > If we do that, I'm not sure if Drill could error out for "select * > from mytable where day = '2017-04-01' ", if there is no "day" field in > the directory names. The thing is day could come from either > directory, or from

Drill Parquet Partitioning Method

2017-04-03 Thread John Omernik
So as a user of Drill now for a while, I have gotten used to the idea of partitions just being values, instead of key=value like other things (hive, impala, others). From a user/analyst perspective, the dir0, dir1, dirN methodology provides quite a bit of flexibility, but to be intuitive, we

Re: Discussion: Comments in Drill Views

2017-03-06 Thread John Omernik
it really necessary to put a technical limit in to prevent people from > > OVER-documenting views? > > > > > > What is the last time you saw code that had too many comments in it? > > > > > > > > On Thu, Mar 2, 2017 at 8:42 AM, John Omernik <j...@om

Re: [Drill 1.9.0] : [CONNECTION ERROR] :- (user client) closed unexpectedly. Drillbit down?

2017-03-06 Thread John Omernik
ilar configuration params for Drill > 1.6 where this issue was not coming. > Anything else which i can try? > > Regards, > *Anup Tiwari* > > On Fri, Mar 3, 2017 at 11:01 PM, Abhishek Girish <agir...@apache.org> > wrote: > > > +1 on John's suggestion. > > >

Re: [Drill 1.9.0] : [CONNECTION ERROR] :- (user client) closed unexpectedly. Drillbit down?

2017-03-03 Thread John Omernik
P="16G" > > And all other variables are set to default. > > Since we have tried some of the settings suggested above but still facing > this issue more frequently, kindly suggest us what is best configuration > for our environment. > > Regards, > *Anup Tiwari* > >

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-03 Thread John Omernik
Can you help me understand what "local to the cluster" means in the context of a 5 node cluster? In the plan, the files are all file:// Are the files replicated to each node? is it a common shared filesystem? Do all 5 nodes have equal access to the 10 files? I wonder if using a local FS in a

Re: Discussion: Comments in Drill Views

2017-03-02 Thread John Omernik
unal Khatua > > Engineering > > [MapR]<http://www.mapr.com/> > > www.mapr.com<http://www.mapr.com/> > > > From: John Omernik <j...@omernik.com> > Sent: Wednesday, March 1, 2017 9:55:27 AM > To: user > Subject: Re:

Re: [Drill 1.9.0] : [CONNECTION ERROR] :- (user client) closed unexpectedly. Drillbit down?

2017-03-01 Thread John Omernik
Another thing to consider is ensure you have a Spill Location setup, and then disable hashagg/hashjoin for the query... On Wed, Mar 1, 2017 at 1:25 PM, Abhishek Girish wrote: > Hey Anup, > > This is indeed an issue, and I can understand that having an unstable > environment

Re: Drill Views and Typed Columns

2017-03-01 Thread John Omernik
ny records are > affected of a certain type/change/etc. A endless list of possibilities. It > should be feasible to utilize Drill as the execution engine with a smart > tool on top of it to process. > > --Andries > > > > On May 16, 2016, at 4:08 PM, John Omern

Re: Discussion: Comments in Drill Views

2017-03-01 Thread John Omernik
sting. I love docstrings in Lisp and Python and Javadoc > in Java. > > Basically this is like that, but for SQL. Very helpful. > > On Thu, Jun 23, 2016 at 11:48 AM, John Omernik <j...@omernik.com> wrote: > > > I am looking for discussion here. A colleague was asking me

Re: debugging bad input

2017-03-01 Thread John Omernik
The first thing I would try is turning on verbose errors. The setting for that is exec.errors.verbose I use select * from sys.options quite a bit when determining how to approach problems. To alter your session to use verbose errors type ALTER SESSION set `exec.errors.verbose` = true; Then

Re: Dealing with bad data when trying to do date computations

2017-03-01 Thread John Omernik
est with a function, what will need to be included for your pull request to be approved? We should not allow functions to be added to Drill without a basic doc addition. 4. Update procedures? This is complicated, but done well, it could really put the knowledge and analyst needs right in the syste

Re: Odd Issue with to_date function

2017-02-28 Thread John Omernik
if they find this thread, they may not have the ability to make this change (ranging from they don't have the permissions, to admins just say no because of the other effects) We need a way to handle this at the session level. a On Fri, Apr 1, 2016 at 6:15 PM, John Omernik <j...@omernik.com>

Spill Location, permissions and Authentication

2017-02-28 Thread John Omernik
I am using 1.8 as of this time, if this is fixed in a newer version, please let me know. I am running drill as a master user. (mapr) and have enabled authentication. When I authenticate to drill, and try to run a query, specifically one with hash joins turned off (thus using spill to disk), I am

Re: Dealing with bad data when trying to do date computations

2017-02-28 Thread John Omernik
You could also generate documentation updates via query at each release. This would be a great feature, move the information close to the analysts hands, I love how that would work. (I think I remember some talk about extending sys.options to be self documenting as well ) On Tue, Feb 28,

Re: Dealing with bad data when trying to do date computations

2017-02-28 Thread John Omernik
documented up to now) and then get processes in place to ensure ongoing updates. Thanks John Omernik On Tue, Feb 28, 2017 at 10:15 AM, Charles Givre <cgi...@gmail.com> wrote: > Hi John, > I believe that Drill 1.9 includes a REGEXP_MATCHES( , ) > function which does what

Dealing with bad data when trying to do date computations

2017-02-28 Thread John Omernik
I have a data set that has birthdays in YYYY-MM-DD format. Most of this data is great. I am trying to compute the age using EXTRACT(year from age(dob)) But some of my data is crapola... let's call it alternative data... When I try to run the Extract function, I get Error: SYSTEM ERROR:
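The general idea of validating a value before computing on it can be sketched in Python. This is not a Drill function and not the fix discussed in the thread; it only illustrates the behaviour the poster is after, namely returning a null instead of an error when a birthday string is bad. It assumes the dates are year-month-day strings such as 1980-02-28; the function name and sample values are hypothetical, and the reference date is fixed so the result is reproducible.

```python
from datetime import date, datetime
from typing import Optional


def age_in_years(dob_text: str, today: date) -> Optional[int]:
    """Return whole years between dob_text and today, or None when the
    text is not a valid year-month-day date or lies in the future."""
    try:
        dob = datetime.strptime(dob_text.strip(), "%Y-%m-%d").date()
    except ValueError:
        return None  # unparseable input becomes a null instead of an error
    if dob > today:
        return None
    years = today.year - dob.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


today = date(2017, 2, 28)  # fixed reference date for a reproducible example
samples = ["1980-02-28", "1980-03-01", "crapola", "1999-13-45"]
ages = [age_in_years(value, today) for value in samples]
```

With the fixed reference date above, the two valid strings yield 37 and 36, and both malformed strings yield None rather than aborting the whole computation.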

Re: Drill and Elasticsearch

2017-02-23 Thread John Omernik
21, 2017 at 10:32 PM, John Omernik <j...@omernik.com> wrote: > > > > > I guess, I am just looking for ideas, how would YOU get data from > Parquet > > > files into Elastic Search? I have Drill and Spark at the ready, but > want to > > > be able to handle

Drill and Elasticsearch

2017-02-21 Thread John Omernik
I know there are conversations about an Elasticsearch plugin, however, I had a recent need to take some data that was accessible in Drill (stored as a Parquet table) and move it into an Elasticsearch index. There are about 1 million rows in the source data. I was learning the technologies here,

Re: ideal drill node size

2017-02-06 Thread John Omernik
I think you would be wasting quite a bit of your server if you split it up into multiple vms. Instead, I am thinking a larger drill bit size wise (ensure you are upping your ram as high as you can) would be best. Note I am not an expert on this stuff, I would like an expert's take as well. Here is

Re: IndexR, a new storage plugin for Drill

2017-01-03 Thread John Omernik
This looks very interesting! Can't wait to see some how-to's to get the server nodes setup, and kafka pipelines setup. I'd be very interested in trying this once it's setup. Thanks! On Tue, Jan 3, 2017 at 2:35 AM, WeiWan wrote: > IndexR is a distributed, columnar

Re: Batch load of unstructured data in Drill

2016-12-08 Thread John Omernik
Sure... I believe you could CTAS from your json directory into a tmp parquet directory and then move the resultant files into the final parquet directory i.e. Drill Query: Create table `.mytempparq` as select * from `.mytempjson` Filesystem command: mv ./mytempparq/* ./myfinalparq It would be

NPE on Select with Options on CSV File

2016-12-08 Thread John Omernik
Hey all, I am trying to do a select with options on a CSV file. select columns[0], columns[1] already works for this data. Ideally, I am trying to do a select * from table(dfs.root.`path/to/data.csc'(type => 'text', extractHeader => true, fieldDelimiter => ',') limit 10 and have it work and

SQL Line Formating

2016-12-08 Thread John Omernik
Hey all, I have a puzzler (I think). I have a directory with JSON, it's great, it queries well, it's well formatted. I created a view on that directory. Added some columns (like a timestamp version of the EPOCH time field) When I run a query in SQL Line of the view, I get a well formatted

Re: Slow query on parquet imported from SQL Server while the external SQL server is down.

2016-12-01 Thread John Omernik
plugin > isn't > > > configured correctly or the underlying datasource is not up, this could > > > drastically slow down the query execution time. > > > > > > I'll look up to see if we have a JIRA for this already - if not will > file > > > one. >

Re: Slow query on parquet imported from SQL Server while the external SQL server is down.

2016-11-30 Thread John Omernik
So just my opinion in reading this thread. (sorry for swooping in and opining) If a CTAS is done from any data source into Parquet files there should be NO dependency on the original data source to query the resultant Parquet files. As a Drill user, as a Drill admin, this breaks the concept
