Re: Does Drill support the IIF or CASE WHEN operators?

2021-06-24 Thread Jason Altekruse
Hi Rob, Drill definitely supports CASE WHEN, here is the short doc page for it. https://drill.apache.org/docs/case/ Jason Altekruse On Thu, Jun 24, 2021 at 3:52 PM Lehrbass, Robert < rlehrb...@broadviewsoftware.com> wrote: > Hi there, > > Just a small question - during our eval

Re: [Drill 1.6] : Number format exception due to Empty String

2016-10-15 Thread Jason Altekruse
at the system level. CASE WHEN trim(column_a) = '' THEN NULL ELSE cast(column_a as INTEGER) END [1] - https://issues.apache.org/jira/browse/DRILL-3363 Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Sat, Oct 15, 2016 at 6:47 AM, Khurram Faraaz <kfar...@maprtech.com>
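A fuller sketch of the expression above, for anyone who wants to try it in a query. Only `column_a` and the CASE expression come from the message; the file path and the output alias are made up for illustration.

```sql
-- Hypothetical path and alias; column_a is assumed to be read as a string.
SELECT
  CASE
    WHEN TRIM(column_a) = '' THEN NULL   -- empty or whitespace-only value becomes NULL
    ELSE CAST(column_a AS INTEGER)       -- anything else is cast as usual
  END AS column_a_int
FROM dfs.`/tmp/example.json`;
```

The guard runs per row, so rows with an empty string never reach the cast that raises the number format exception.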

Re: Suggestions for hangout topics for 08/09

2016-08-08 Thread Jason Altekruse
Yeah, I can join the hangout tomorrow to talk about the PR, thanks for the heads up. Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Mon, Aug 8, 2016 at 12:09 PM, Zelaine Fong <zf...@maprtech.com> wrote: > Jason -- will you be able to join tomorrow's hangout, sinc

Re: question about drill

2016-07-06 Thread Jason Altekruse
If you want to combine files from different directories where there are no patterns in the file or directory names, you could use a UNION ALL to combine datasets [1]. [1] - https://drill.apache.org/docs/select-union/ Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Wed, Jul
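A minimal sketch of the UNION ALL approach described above. The directory paths and column names are assumptions, not taken from the thread.

```sql
-- Hypothetical directories and columns, for illustration only.
SELECT id, name FROM dfs.`/data/incoming/batch_a`
UNION ALL
SELECT id, name FROM dfs.`/data/archive/other_batch`;
```

Listing the columns explicitly on both sides keeps the two inputs aligned even when the files carry extra fields.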

Re: gzipped json files not named .json.gz

2016-06-28 Thread Jason Altekruse
rename, as I would think other tools would have trouble reading these as well. Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Tue, Jun 28, 2016 at 2:48 PM, Parth Chandra <pchan...@maprtech.com> wrote: > Yes, I believe that would work if the file is not compressed. >

Re: Scaling Drill with s3 plugin

2016-06-28 Thread Jason Altekruse
You can also configure storage plugins with the rest API. [1] [1] - https://drill.apache.org/docs/rest-api/ Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Tue, Jun 28, 2016 at 12:04 PM, Scott Kinney <scott.kin...@stem.com> wrote: > From what I can tell the s3 pl

Re: Merging files

2016-06-23 Thread Jason Altekruse
Apply a sort in your CTAS, this will force the data down to a single stream before writing. Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Thu, Jun 23, 2016 at 10:23 AM, John Omernik <j...@omernik.com> wrote: > When have a small query writing smaller data (like
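As a rough illustration of the suggestion above, a CTAS with a sort might look like the following. The table name, source path and columns are hypothetical; `dfs.tmp` is assumed to be a writable workspace.

```sql
-- Hypothetical names; the ORDER BY is what funnels the data into one stream.
CREATE TABLE dfs.tmp.`merged_output` AS
SELECT id, name
FROM dfs.`/data/small_files`
ORDER BY id;
```

Without the ORDER BY, each query fragment can write its own output file.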

Re: How to extend system or session options

2016-06-13 Thread Jason Altekruse
/jira/browse/DRILL-4047 Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Mon, Jun 13, 2016 at 10:55 AM, Jiang Wu <jiang...@numerxdata.com> wrote: > I meant runtime options that are settable via: > > ALTER SYSTEM SET 'xyz' = true > ALTER SESSION SET 'xyz' = fal

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Jason Altekruse
Congrats Parth! Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Wed, May 25, 2016 at 10:39 AM, rahul challapalli < challapallira...@gmail.com> wrote: > Congratulations Parth! > > Thank You Jacques for your leadership over the last few years. > > On We

Re: CTAS Out of Memory

2016-05-13 Thread Jason Altekruse
I am curious if this is a bug in the JDBC plugin. Can you try to change the output format to CSV? In that case we don't do any large buffering. Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Fri, May 13, 2016 at 10:35 AM, Stefan Sedich <stefan.sed...@gmail.com>

Re: CTAS Out of Memory

2016-05-13 Thread Jason Altekruse
we are using still buffers a non-trivial amount of data into heap memory when writing parquet files. Try raising your JVM heap memory in drill-env.sh on startup and see if that prevents the out of memory issue. Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Fri, May 13, 2016

Re: Drill (CTAS) Default hadoop Replication factor on HDFS ?

2016-05-10 Thread Jason Altekruse
to investigate the current design further and fix the bug [1]. [1] - https://issues.apache.org/jira/browse/DRILL-4663 Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Tue, May 10, 2016 at 12:54 AM, Shankar Mane <shankar.m...@games24x7.com> wrote: > Thanks Abhish

Re: Regarding Join Query Problem

2016-04-21 Thread Jason Altekruse
am using two different storage plugin name with same credential it works properly. *Is it possible to join two table using same storage plugin name? If yes,then What am I doing wrong in this query?* Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Thu, Apr 21, 2016 at 8:15 AM

Re: Unable to connect to S3 parquet data using Drill

2016-04-20 Thread Jason Altekruse
m wherever it puts metadata in the bucket (maybe a hidden file or something?) and it isn't finding it. This makes me think that your bucket isn't set up as it is expected to be for a connection using s3://. [1] - https://wiki.apache.org/hadoop/AmazonS3 Jason Altekruse Software Engineer at Dremio Ap

Re: Unable to connect to S3 parquet data using Drill

2016-04-20 Thread Jason Altekruse
Which version of Drill are you running? The config block for adding your credentials was added in a recent release, I believe 1.5. Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Wed, Apr 20, 2016 at 1:38 PM, Nick Monetta <ni...@inrix.com> wrote: > Copying and pas

Re: Question on querying parquet files

2016-04-14 Thread Jason Altekruse
by padding). Here is some relevant discussion on the topic from the parquet list [2]. [1] - http://alluxio.org/ (formerly Tachyon) [2] - https://mail-archives.apache.org/mod_mbox/parquet-dev/201601.mbox/%3CCA+CA-8vN4heWcLc7=fzp==z895whyaucnd8jdq6q4-spt0j...@mail.gmail.com%3E Jason Altekruse Software

Re: Continued Avro Frustration

2016-04-01 Thread Jason Altekruse
definition of the task of writing a format plugin. This is a community contribution that should be easier and more strongly encouraged than it is today, and could really help new users adopt Drill if they are using other data formats. Jason Altekruse Software Engineer at Dremio Apache Drill Committer

Re: How to modify connection timeout delay ?

2016-03-24 Thread Jason Altekruse
find in the logs? To adjust the timeout you can set a higher value for the drill.exec.user.timeout in conf/drill-override.conf, the value is specified in seconds. Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Thu, Mar 24, 2016 at 9:20 AM, COUERON Damien (i-BP - MICROPOLE

Re: Moving to HBase 1.1 [DRILL-4199]

2016-03-21 Thread Jason Altekruse
With the recent issues that have been discussed on other threads related to correctness issues when using our current client I agree we should upgrade. +1 Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Mon, Mar 21, 2016 at 1:18 PM, Aditya <a...@apache.org> wrote:

Hangout happening now!

2016-03-15 Thread Jason Altekruse
Hello All, Join us to discuss the latest happenings in Drill! https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc Jason Altekruse Software Engineer at Dremio Apache Drill Committer

Re: NumberFormatException with cast to double?

2016-03-10 Thread Jason Altekruse
and provides your default value of 0 in that case. - Jason Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Thu, Mar 10, 2016 at 2:32 PM, Matt <bsg...@gmail.com> wrote: > Have some CSV data that Drill 1.5 selects as-is without any problems, > until I attempt to

Re: [E] Re: binary file searching

2016-03-08 Thread Jason Altekruse
ed tool I use to read the files today is written in C. It > converts the files to ASCII CSV and prints to STDOUT. Would that work in a > format plugin? > > Thanks, > Scott Wilburn > > > -----Original Message- > From: Jason Altekruse [mailto:altekruseja...@gmail.com] >

Re: binary file searching

2016-03-08 Thread Jason Altekruse
Not quite sure what you mean by "compiled tool". I'm guessing the most efficient way to accomplish something like this would be to write a format plugin that would call out to this external code and then read the result with the Drill CSV reader. Is this external tool written in Java or another

Re: Parallelism / data locality in HDFS/MapRFS

2016-03-08 Thread Jason Altekruse
Considering your description of the data, 1.5 GB per file with only 500 records in each gives you records of roughly 3 MB each. This in itself doesn't necessarily cause an issue, but the structure of your example record makes me think you may have many individual columns in the nested structure,

Re: Drill querying S3

2016-03-07 Thread Jason Altekruse
1. Yes, you can query S3 from a Drill cluster in AWS. 2. There is a good chance you would see better performance by putting the data as close to Drill as possible. Depending on your workload and dataset you might encounter a case where this doesn't matter as much, but most typical cases will benefit

Re: Drill Bit Heap Space Issues

2016-03-01 Thread Jason Altekruse
To_date takes a long assumed to be a unix timestamp, so the error you are getting here is from an implicit cast trying to turn the string into a long before converting it to a date. You can provide a second parameter to tell it how you would like to parse your string to properly parse these kinds
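A small sketch of the two-argument form mentioned above. The literal and the pattern are made-up examples; the second argument is a date format pattern describing how the string is laid out.

```sql
-- Hypothetical input string and pattern, for illustration only.
SELECT TO_DATE('2016-03-01', 'yyyy-MM-dd') AS parsed_date
FROM (VALUES(1));
```

With the pattern supplied, the string is parsed directly instead of being implicitly cast to a long first.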

Re: extractHeader in session variable?

2016-02-29 Thread Jason Altekruse
You need to specify the delimiter, it doesn't seem to default to comma as the field delimiter. On Mon, Feb 29, 2016 at 11:46 AM, Christopher Matta wrote: > Actually I spoke too soon. > > Running what Jacques suggested returned a single column for all the data, > with all the

Re: Avro support in Drill - Missing support for the IN operator and other frustrating things

2016-02-26 Thread Jason Altekruse
Stefan, I'm sorry that we have not been better about getting back to the issues you have filed against the Avro reader. We do appreciate all of the effort you have put into filing thorough bugs and being active in the discussions on the list. I have responded on the bug you filed on this issue

Re: Hangout Happening Now!

2016-02-23 Thread Jason Altekruse
try restarting the > hangout? > > Parth > > On Tue, Feb 23, 2016 at 10:04 AM, Jason Altekruse < > altekruseja...@gmail.com> > wrote: > > > https://plus.google.com/hangouts/_/dremio.com/drillhangout?authuser=1 > > >

Re: Query Return Error because of a single file

2016-02-23 Thread Jason Altekruse
Hello François, Sorry that this question went unanswered for so long. We have gotten many requests for this feature of skipping bad files, but we haven't come to a consensus of how this feature should be implemented. The problem largely comes out of the ambiguity of the definition of skipping

Re: [E] Re: Multiple Delimiter Format

2016-02-19 Thread Jason Altekruse
Don't know how big your data is, as this won't be the most efficient solution, but there is a hack that can enable this without modifying Drill. Create a new format that actually won't parse anything but the lines, put in a delimiter that doesn't appear in your dataset. "not_csv": {

Re: Can't build the custom files for Apache Drill 1.5.0

2016-02-17 Thread Jason Altekruse
lly takes some while to propagate to mirrors. > > On Wed, Feb 17, 2016 at 11:38 AM, Kumiko Yada <kumiko.y...@ds-iq.com> > wrote: > > > I just tried it, and I'm still seeing the same issue. > > > > -Kumiko > > > > -Original Message- > > F

Re: Can't build the custom files for Apache Drill 1.5.0

2016-02-17 Thread Jason Altekruse
I had forgotten to finish publishing the Maven artifacts, I just completed the artifact release process and verified that I could retrieve the artifacts from the Maven repository with new UDF project. Apologies for the oversight, please respond back if you still see issues. On Wed, Feb 17, 2016

[ANNOUNCE] Apache Drill 1.5.0 released

2016-02-16 Thread Jason Altekruse
On behalf of *Apache* *Drill* community, I am happy to *announce* the release of *Apache* *Drill* 1.5.0. The source and binary artifacts are available at [1] Review a complete list of fixes and enhancements at [2] This release of *Drill* fixes many issues and introduces a number of enhancements,

Hangout happening now!

2016-02-16 Thread Jason Altekruse
Join us to hear the latest news on Drill, anyone with an interest in Drill is welcome to join. https://plus.google.com/hangouts/_/dremio.com/drillhangout - Jason

Re: Passing json/map as string to UDF

2016-02-11 Thread Jason Altekruse
I think what you are looking for is convert_to( column_name, 'JSON') That being said, are you going to be parsing this JSON in the function? I think it would make more sense to just have the function take the complex input. The only case where I would suggest taking in JSON, and parsing it in a

Re: Source for drill's calcite?

2016-02-09 Thread Jason Altekruse
I can't find the latest version either, but this is the r9 branch. I don't think any very major changes happened in the last update (it's likely just back-porting a fix from calcite master). So you can base your work on this branch and rebase it when someone points you to the updated branch.

Hangout starting now!

2016-02-09 Thread Jason Altekruse
Join us to hear the latest Drill news or to bring up any concerns you would like to see addressed. https://plus.google.com/hangouts/_/dremio.com/drillhangout?authuser=1

Re: Analyse web server logs in Drill

2016-02-07 Thread Jason Altekruse
For searching the mail archives, I use this site[1]. It doesn't seem to have any official association with Apache, but it gets the job done when I have deleted the thread from my e-mail. [1] - http://search-hadoop.com/ On Sat, Feb 6, 2016 at 8:35 PM, Jacques Nadeau wrote: >

Please vote for proposed Drill talks for the Hadoop Summit

2016-02-05 Thread Jason Altekruse
Hello Drillers, There are some great proposed talks for this year's Hadoop summit related to Drill. Please help to promote Drill in the wider Big Data community by taking a look through the list and voting for talks that sound good. You don't need to register or anything to vote, it just asks

Re: Creating a single parquet or csv file using CTAS command?

2016-02-04 Thread Jason Altekruse
...? > > Thank you. > > > > > > > > > > > > > > On Thu, Feb 4, 2016 at 2:40 PM, Jason Altekruse <altekruseja...@gmail.com> > wrote: > > > Are you even trying to write parquet files? in your original post you > said > > you are

Re: Parquet drill date fields

2016-02-04 Thread Jason Altekruse
Hi Stefan, There is a reason that dictionary is disabled by default. The parquet-mr library we leverage for writing parquet files currently has the behavior to write nearly all columns as dictionary encoded for all types when dictionary encoding is enabled. This includes columns with integers,

Re: Creating a single parquet or csv file using CTAS command?

2016-02-04 Thread Jason Altekruse
Are you even trying to write parquet files? In your original post you said you are writing CSV files, but then gave files with parquet extensions as what you are trying to concatenate. I'm a little confused, though: if you are not working with tools for big data, concatenating parquet files is not

Re: REGEX search Operator

2016-02-04 Thread Jason Altekruse
I didn't realize that we were lacking this functionality. As the repeated_contains operator handles wildcards it makes sense to add such a function to drill. It should be simple to implement, would someone like to open a JIRA and submit a PR for this? - Jason On Tue, Feb 2, 2016 at 8:56 AM,

Re: REGEX search Operator

2016-02-04 Thread Jason Altekruse
Awesome, thanks! On Thu, Feb 4, 2016 at 7:44 AM, Nicolas Paris <nipari...@gmail.com> wrote: > Well I am creating a udf > good exercise > I hope a PR soon > > 2016-02-04 16:37 GMT+01:00 Jason Altekruse <altekruseja...@gmail.com>: > > > I didn't realize tha

Re: REGEX search Operator

2016-02-04 Thread Jason Altekruse
> 2. do I need a jira ? how proceed ? > > For now, I only published it on my github account in a separate project > > Thanks > > 2016-02-04 16:52 GMT+01:00 Jason Altekruse <altekruseja...@gmail.com>: > > > Awesome, thanks! > > > > On Thu, Feb 4, 2016 at 7:44

Re: REGEX search Operator

2016-02-04 Thread Jason Altekruse
> > > '\d\d\d\d-\d\d-\d\d' needed to be '\\d\\d\\d\\d-\\d\\d-\\d\\d' if we can > > avoid that it would be AWESOME. > > > My guess is this comes from java way to handle strings. All languages I > have used need to double escape. > > > > On Thu, Feb 4, 2016 a

Re: Avro reader - Possible regression in 1.5-SNAPSHOT

2016-02-02 Thread Jason Altekruse
I made a comment on the JIRA about a possible explanation for this, it seems like a configuration/classpath issue. I would recommend moving the discussion there because it doesn't always forward JIRA updates to the list. On Tue, Feb 2, 2016 at 10:46 AM, Jinfeng Ni wrote:

Re: Ask for help about Drill

2016-01-30 Thread Jason Altekruse
We don't publish all of the generated sources, but they are easy to generate. Just run this from the project root. We've been having some permgen issues with Maven recently, so you may need to bump up the permgen if it fails. To be safe you can just run this before the install command: export

Re: DATA_WRITE ERROR: Failed to drop table

2016-01-28 Thread Jason Altekruse
need to > build the drill with some changes, I can test/provide more info if it's > needed. > > Thanks > Kumiko > > -Original Message- > From: Jason Altekruse [mailto:altekruseja...@gmail.com] > Sent: Friday, January 22, 2016 4:31 PM > To: user <user@drill.apache.org>

Re: Set table creation format in a REST query call

2016-01-28 Thread Jason Altekruse
A few weeks ago a fix for this issue was merged, it requires using the new web UI security feature, which will hold onto a session while you are logged in. [1] You can try to build the tip of master yourself or wait for the upcoming 1.5.0 release to try it out. [1] -

Re: CTAS and storage format via REST API

2016-01-28 Thread Jason Altekruse
A few weeks ago a fix for this issue was merged, it requires using the new web UI security feature, which will hold onto a session while you are logged in. [1] You can try to build the tip of master yourself or wait for the upcoming 1.5.0 release to try it out. [1] -

Re: Proper errors should be thrown on UI while failure in adding plugin

2016-01-28 Thread Jason Altekruse
I'm actually working on fixing this right now, I'm merging a simple fix for the annoyance that your changes get erased upon failure and then I will be working on improving the message. Watch https://issues.apache.org/jira/browse/DRILL-2653 for updates. - Jason On Thu, Jan 28, 2016 at 3:36 AM,

Hangout notes from Tuesday

2016-01-28 Thread Jason Altekruse
Quick PSA for anyone who sees these messages and wants to attend. We have a weekly google hangout on Tuesdays at 10am Pacific time. There are no requirements for showing up, anyone with an interest in Drill can join. We send out a message with the link as the meeting is starting as a reminder.

Re: DATA_WRITE ERROR: Failed to drop table

2016-01-22 Thread Jason Altekruse
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) > ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) > ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.util.concurren

Re: DATA_WRITE ERROR: Failed to drop table

2016-01-22 Thread Jason Altekruse
on a commit near whatever release you are on. - Jason On Fri, Jan 22, 2016 at 4:19 PM, Kumiko Yada <kumiko.y...@ds-iq.com> wrote: > No, either of these were followed by a "Caused by" section with another > stacktrace. > > -Kumiko > > -Original Me

Re: DATA_WRITE ERROR: Failed to drop table

2016-01-22 Thread Jason Altekruse
d.run(Thread.java:745) [na:1.7.0_71] > 2016-01-22 23:16:45,852 [Client-1] INFO > o.a.drill.exec.rpc.user.UserClient - Channel closed /192.168.200.129:57436 > <--> /192.168.200.129:31010. > 2016-01-22 23:16:45,853 [qtp1683069650-5882] ERROR > o.a.d.e.server.rest.QueryResources - Query from

Re: CTAS plan showing single node?

2016-01-21 Thread Jason Altekruse
The query plans can indicate if a query is parallelized, by looking for exchanges, which are used to merge work from multiple execution fragments, or to re-distribute data for an operation. Execution fragments can run on different threads or different machines. The best place to find out how
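To look for exchanges as described above, one option is to ask Drill for the plan of the statement. This is only a sketch: the query being explained is hypothetical, and the exact operator names in the output (for example ones ending in `Exchange`) depend on the query and the Drill version.

```sql
-- Hypothetical query; inspect the returned plan text for exchange operators.
EXPLAIN PLAN FOR
SELECT name, COUNT(*) AS cnt
FROM dfs.`/data/example_dir`
GROUP BY name;
```

A plan with no exchanges at all suggests the work runs in a single fragment.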

Re: query plan caching?

2016-01-14 Thread Jason Altekruse
Currently not to my knowledge. Are there queries you are seeing that are taking an abnormally long time to plan? On Thu, Jan 14, 2016 at 6:25 AM, Vince Gonzalez wrote: > Does Drill do any caching of query plans? >

Re: JSON File, Total numbers Record: 1

2016-01-14 Thread Jason Altekruse
_zone_H1":"1899.0","time_in_zone_H2":"2719.0","time_in_zone_H3":"4018.0","time_in_zone_H4":"1285.0","percent_in_zone_H1":"19.14122","percent_in_zone_H2":"27.40651","p

Re: "Already had POJO for id" Error In SQL - Join B/W Redshift and HDFS

2016-01-14 Thread Jason Altekruse
Jacques, not sure if you caught this, in the stacktrace it mentions broadcast sender. Did the plan for your test query include a broadcast join? * (com.fasterxml.jackson.databind.JsonMappingException) Already had POJO for id (java.lang.Integer)

Fwd: Drill query does not return all results from HBase

2016-01-14 Thread Jason Altekruse
but I'm wrapping up some other things right now. -- Forwarded message -- From: Kumiko Yada <kumiko.y...@ds-iq.com> Date: Thu, Jan 14, 2016 at 12:53 PM Subject: RE: Drill query does not return all results from HBase To: Jason Altekruse <altekruseja...@gmail.com> Jason,

Re: JSON File, Total numbers Record: 1

2016-01-13 Thread Jason Altekruse
flat_intervals from (select >> flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t ) as tt* >> >> [image: Immagine incorporata 1] >> >> >> Could u pls suggest how to fix this drill? >> >> Best regards, >> Paolo >> >> >

Re: JSON File, Total numbers Record: 1

2016-01-13 Thread Jason Altekruse
t ) as tt > [30027]Query execution error. Details:[ > SYSTEM ERROR: DrillRuntimeException: kvgen function only supports Simple > maps as input > > Fragment 0:0 > > [Error Id: 85f8e8ba-fb87-428a-ac2e-dea78498c222 on 10.1.0.74:31010] > ] > > 2016-01-13 20:28 GMT+01:00 Jason Altekrus

Re: Drill query does not return all results from HBase

2016-01-13 Thread Jason Altekruse
Re: Drill query does not return all results from HBase > > Well, the major version din't change if I remember it right, hence did not > share the info in my previous mail. I'm on HBase 1.1.1 right now and don't > see the issue. Also, I am on a MapR setup, which might not be comparable > with t

Re: problem with drill

2016-01-13 Thread Jason Altekruse
The approval messages sent to list managers actually don't filter out images, it happens before it is sent out to the actual list. I threw them onto imgur for others to see. http://imgur.com/a/L7ksG The jdbc-all jar should contain everything needed to connect to drill. I can't think of a reason

Hangout starting now

2016-01-12 Thread Jason Altekruse
https://plus.google.com/hangouts/_/dremio.com/drillhangout

Re: Drill query does not return all results from HBase

2016-01-12 Thread Jason Altekruse
I'm not sure why this is happening, we have tests in our automated suite that I believe run some pretty large queries against Hbase and verify the results. Aditya, do you have some time available to try to reproduce this and diagnose the problem? On Wed, Jan 6, 2016 at 2:03 PM, Kumiko Yada

Re: Classpath scanning & udfs

2016-01-12 Thread Jason Altekruse
xec/java-exec/src/main/resources/drill-module.conf> - Jason On Mon, Jan 11, 2016 at 11:24 AM, rahul challapalli < challapallira...@gmail.com> wrote: > Sure! > > On Mon, Jan 11, 2016 at 11:06 AM, Jason Altekruse < > altekruseja...@gmail.com> > wrote: > > >

Re: Drill query does not return all results from HBase

2016-01-12 Thread Jason Altekruse
> Although I could repro' the problem consistently, it was resolved once i > updated my Hadoop setup. My guess is that it was a HBase bug which got > resolved. Although strange as it seems, it might not have to do with Drill > itself. > > -Abhishek > > On Tue, Jan 12, 2016 at 7:52

Re: Null Return

2016-01-12 Thread Jason Altekruse
Nirav, Were you able to get your function working? On Wed, Jan 6, 2016 at 6:42 AM, Jason Altekruse <altekruseja...@gmail.com> wrote: > I believe you are hitting a case that Jacques was trying to describe with > his last message. > > Drill currently matches functions on b

Re: JSON File, Total numbers Record: 1

2016-01-11 Thread Jason Altekruse
Paolo, Drill currently reads single JSON objects as single records. If you look at the top of your file you can see that the root of your document is a single JSON object. Drill accepts two formats for individual records: The Mongo import format, a series of JSON object one after the other in a

Re: Classpath scanning & udfs

2016-01-11 Thread Jason Altekruse
Rahul, The error message you are seeing is in reading a storage plugin configuration file. I am planning to fix these kinds of messages to actually direct users at the file that is failing parsing. I have seen this in the past when the classpath was incorrect and one of the plugins (like Hbase)

Re: Classpath scanning & udfs

2016-01-11 Thread Jason Altekruse
Jars that don't contain a drill-module.conf will not get scanned. > > > > > > > > > > > > > > > > On Mon, Jan 11, 2016 at 10:17 AM, rahul challapalli < > > > > challapallira...@gmail.com> wrote: > > > > > > > >

Re: SQL Lookup table - how to mimic?

2016-01-06 Thread Jason Altekruse
the constraint. For your example here, you could use an inner join to find all of the records that have valid county codes, and only insert those into your table. - Jason Altekruse On Wed, Jan 6, 2016 at 7:31 AM, Christopher Matta <cma...@mapr.com> wrote: > Are you asking about a simple JOIN? >
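One way the inner-join filter above could be written, assuming the result is materialized with a CTAS. All table names, paths and columns here are hypothetical; only the idea of joining against the list of valid county codes comes from the message.

```sql
-- Hypothetical inputs: a records file and a lookup file of valid county codes.
CREATE TABLE dfs.tmp.`valid_records` AS
SELECT r.id, r.county_code, c.county_name
FROM dfs.`/data/records.json` AS r
INNER JOIN dfs.`/data/county_codes.json` AS c
  ON r.county_code = c.county_code;   -- rows with unknown codes drop out here
```

The join plays the role a foreign-key constraint would play in a traditional database: records whose code has no match in the lookup never reach the output table.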

Re: Null Return

2016-01-06 Thread Jason Altekruse
I believe you are hitting a case that Jacques was trying to describe with his last message. Drill currently matches functions on both data type (int, varchar, boolean) and data mode (nullable or required). If you look at the error message, you can see that the data types that drill is receiving

Re: Performance of Drill SQL for Hadoop when Drill is outside Hadoop cluster

2016-01-02 Thread Jason Altekruse
Hi Shashanka, Drill does have the ability to avoid reading part of your data by using partitioning. This currently works best using partitioned parquet files. Drill includes an auto-partitioning feature available for use with the CREATE TABLE AS statement that works when outputting to the parquet
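A sketch of the auto-partitioning CTAS mentioned above, assuming parquet output. The table name, source path and column names are invented for the example; the partition column has to appear in the select list.

```sql
-- Hypothetical names throughout; output format is set to parquet first.
ALTER SESSION SET `store.format` = 'parquet';

CREATE TABLE dfs.tmp.`events_partitioned`
PARTITION BY (event_year)
AS
SELECT event_year, event_id, payload
FROM dfs.`/data/events.json`;

-- A filter on the partition column lets Drill skip files that cannot match.
SELECT event_id
FROM dfs.tmp.`events_partitioned`
WHERE event_year = 2015;
```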

Re: Python Driver Contribution Idea

2015-12-28 Thread Jason Altekruse
One thing we should fix to make this easier is to provide properly typed data through the rest API. This result listener is transforming the native drill record format into a simple hashmap with both the keys and values provided as strings. This list of hashmaps is serialized by jackson into the

Re: Python Driver Contribution Idea

2015-12-28 Thread Jason Altekruse
There is already a JIRA opened for this issue with some work done by Adam Gilmore. Looks like we never merged it though. I'll try to take a look at this soon and get it merged. https://issues.apache.org/jira/browse/DRILL-2373 On Mon, Dec 28, 2015 at 11:03 AM, Jason Altekruse <altekrus

Re: [ANNOUNCE] Release of Apache Drill 1.4.0

2015-12-16 Thread Jason Altekruse
Hi John, Unfortunately this feature only works with the options that can be set as part of the format plugin right now. I think we should definitely move the options for interpreting files, like the JSON options you mentioned to the format plugin/select with options scope, rather than making

Re: query has shutdown

2015-12-09 Thread Jason Altekruse
Do you have a message or a stacktrace for the error that occurred? Could you try searching in the logs for "Exception" or "Error" and paste any messages you find here? Was this problem reproducible after you restarted the cluster? On Tue, Dec 8, 2015 at 6:50 PM, Deng Jie

Re: Announcing new committer: Kristine Hahn

2015-12-04 Thread Jason Altekruse
Congrats Kris! Well deserved, the docs are looking great! On Fri, Dec 4, 2015 at 9:36 AM, Sudheesh Katkam wrote: > Congratulations and welcome, Kris! > > > On Dec 4, 2015, at 9:19 AM, Jacques Nadeau wrote: > > > > The Apache Drill PMC is very pleased

Re: Drill Query Problem

2015-12-04 Thread Jason Altekruse
according to [1], there would be a warning in > drillbit.log. > > > [1] > https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/compile/JDKClassCompiler.java#L47 > > On Fri, Dec 4, 2015 at 11:11 AM, Jason Altekruse > <altekruseja

Re: CSV Reader on 1.3

2015-12-04 Thread Jason Altekruse
> CTO and Co-Founder, Dremio > > > > On Thu, Dec 3, 2015 at 5:11 PM, Julien Le Dem <jul...@dremio.com> wrote: > > > > > Thank you! > > > > > > On Thu, Dec 3, 2015 at 5:08 PM, Jason Altekruse < > > altekruseja...@gmail.com> > >

Re: Drill Query Problem

2015-12-04 Thread Jason Altekruse
Do we have something that checks a JDK is installed when launching Drill? Based on this JIRA it looks like we will fall back to janino if we cannot find a JDK. It is possible that this option does nothing if a JDK is not available, or it may fail with an error. We should test this out in various

Re: CSV Reader on 1.3

2015-12-03 Thread Jason Altekruse
wrote: > I didn't notice select with options is already available !!! did we add it > to the documentation ? > > On Thu, Dec 3, 2015 at 12:05 PM, Jason Altekruse <altekruseja...@gmail.com > > wrote: > >> Yes! >> >> Thanks to the new feature "select wit

Re: Newbie question: Parsing error

2015-11-24 Thread Jason Altekruse
Two small points that should get this working for you. First with the parsing issue, Drill uses backticks (`) not single quotes for identifiers like table and column names. Secondly Drill doesn't support querying relative to the path you launched it from. For basic usage without any configuration
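Putting the two points above together, a query against a local file might look like this. The path is a made-up example; what matters is the backticks around the identifier and the absolute path.

```sql
-- Hypothetical absolute path, quoted with backticks rather than single quotes.
SELECT *
FROM dfs.`/home/user/data/example.json`
LIMIT 10;
```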

Re: Order of records read in a parquet file

2015-11-06 Thread Jason Altekruse
The changes to parquet were not supposed to be functional at all. We had been maintaining our fork of parquet-mr to have a ByteBuffer based read and write path to reduce heap memory usage. The work done was just getting these changes merged back into parquet-mr and making corresponding changes in

Re: Order of records read in a parquet file

2015-11-06 Thread Jason Altekruse
Is this a large or private parquet file? Can you share it to allow me to debug the read path for it? On Fri, Nov 6, 2015 at 3:37 PM, Jason Altekruse <altekruseja...@gmail.com> wrote: > The changes to parquet were not supposed to be functional at all. We had > been maintaining our for

Re: Parquet Files Loaded to "directory" partitions Error on Grouping by dir0

2015-11-05 Thread Jason Altekruse
It is possible to introduce a schema change when you are writing parquet from JSON files. If you read part of the JSON that only contains some fields, and later see a new field, after we have written a few batches of data to the parquet file, we will close the current file and open a new one.

Re: Drill CTAS to single file

2015-10-21 Thread Jason Altekruse
and this caused there to be multiple output files (one for each query fragment) without the sort. On Wed, Oct 21, 2015 at 10:36 AM, Jason Altekruse <altekruseja...@gmail.com> wrote: > For clarity, the only reason I said anything about a size limit on a CSV > is that it is possible that Drill may

Re: Drill CTAS to single file

2015-10-21 Thread Jason Altekruse
if that fragment of the query happened to end up with a small amount of data, i.e. very few records were hashed to that bucket in an aggregate, etc.), and may not happen at all when we are writing CSV. On Wed, Oct 21, 2015 at 10:31 AM, Jason Altekruse <altekruseja...@gmail.com> wrote: > Whe

Re: How is dir0 inferred from the directory path

2015-10-20 Thread Jason Altekruse
I can understand an argument for consistency starting at the root requested directory. However, I don't think it is crazy to start at the first variable directory, because anything before that is providing information back to users that they put into the query explicitly themselves. On Mon,

Re: JSON Data with dot in keyname

2015-10-12 Thread Jason Altekruse
That is correct, if this is failing in a select *, that is a bug. Can you please file a JIRA? On Mon, Oct 12, 2015 at 10:27 AM, Paul Ilechko wrote: > Well, if it's json then you do know the field names, they are right there > in the document > > On Mon, Oct 12, 2015 at

Re: Convert from Array to String

2015-10-12 Thread Jason Altekruse
We don't implement casts on array or map, but we do have a convert function that will convert a complex structure to json. You can invoke it like this: convert_to( map_or_list_column_name, 'JSON') This will return the data serialized into JSON in a varchar column. On Mon, Oct 12, 2015 at
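A sketch of serializing a complex column to JSON text. Note that Drill's documentation lists `CONVERT_TO` as the function that goes from a Drill type to an encoding such as JSON, and that it returns binary data; the outer `CONVERT_FROM(..., 'UTF8')` is an assumption added here to get readable text back. The file path and alias are hypothetical.

```sql
-- Hypothetical path; map_or_list_column_name stands in for a map or array column.
SELECT
  CONVERT_FROM(CONVERT_TO(t.map_or_list_column_name, 'JSON'), 'UTF8') AS as_json_text
FROM dfs.`/tmp/example.json` AS t;
```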

Re: repeated_contains - intended behaviour?

2015-09-23 Thread Jason Altekruse
I think it is reasonable to consider that a bug. We should implement the function both as it works today and as you were originally expecting it. Any ideas about a good naming scheme for the two? Unfortunately the regular contains() method does substring matching, but I think the name

Re: NullPointers in type conversions

2015-09-23 Thread Jason Altekruse
> >>>> to an integer invalid? What's the workaround? > >> >>>> > >> >>>> Data > >> >>>> > >> >>>> row1,1,2 > >> >>>> row2,,4 > >> >>>> row3,5,6 > >>

Re: NullPointers in type conversions

2015-09-23 Thread Jason Altekruse
atException: > > Fragment 0:0 > > [Error Id: 2a0d104c-33cd-4680-80fe-f908147b5c0a on > se-node11.se.lab:31010] (state=,code=0) > > Shouldn’t that have returned a null for row2? > ​ > > Chris Matta > cma...@mapr.com > 215-701-3146 > > On Wed, Sep 23, 2015 at 3:

Re: NullPointers in type conversions

2015-09-10 Thread Jason Altekruse
A SQL level null is different than a null at the JAVA level that would be giving this exception (we don't represent nulls with an actual null java object). There might be a way to work around it, but this is a bug in Drill. You should be able to make a cast between compatible types even if there

Re: Naming directories

2015-09-08 Thread Jason Altekruse
ira/browse/DRILL-1441 On Tue, Sep 8, 2015 at 11:22 AM, Jason Altekruse <altekruseja...@gmail.com> wrote: > One thing you can do to speed up the expression evaluation is to use this > expression instead of regex_replace. This will avoid copying each value > into a short
