No one has yet implemented an HBase writer in Drill. Without that, it is
not possible to write into an HBase table.
I don't know if anyone currently plans to work on this. If this is something
you are interested in taking on, I can point you in the right direction.
On Wed, May 4, 2016 at 6:36 AM,
I submitted a pull request a little while ago that introduces (approximate)
median and quantile functions using the tdigest library.
https://github.com/apache/drill/pull/456
It would be great if I could get some feedback on this. Specifically, is it
ok to call these functions median and
Steven Phillips created DRILL-4566:
--
Summary: Add TDigest functions for computing median and quantile
Key: DRILL-4566
URL: https://issues.apache.org/jira/browse/DRILL-4566
Project: Apache Drill
Steven Phillips created DRILL-4562:
--
Summary: NPE when evaluating expression on nested union type
Key: DRILL-4562
URL: https://issues.apache.org/jira/browse/DRILL-4562
Project: Apache Drill
If a fragment has already begun execution and sent some data to downstream
fragments, there is no way to simply restart the failed fragment, because
we would also have to restart any downstream fragments that consumed that
output, and so on up the tree, as well as restart any leaf fragments that
We actually removed the concept of a distributed cache from Drill
altogether. So currently nothing is replacing Hazelcast.
The distributed cache was used for storing the initialization-data for
intermediate fragments. Only leaf node fragments were sent via the RPC
layer. But the distributed cache
I have been thinking about this for a while now, and I feel it would be a
good idea to remove the Required vector types from Drill, and only use the
Nullable version of vectors. I think this will greatly simplify the code.
It will also simplify the creation of UDFs. As is, if a function has custom
Steven Phillips created DRILL-4489:
--
Summary: Add ValueVector tests from Drill
Key: DRILL-4489
URL: https://issues.apache.org/jira/browse/DRILL-4489
Project: Apache Drill
Issue Type: Bug
DRILL-4486 is a pretty simple fix. Without it, currently some regex queries
will fail.
I think we should include it in the release.
https://github.com/apache/drill/pull/412
On Mon, Mar 7, 2016 at 2:15 PM, Jason Altekruse
wrote:
> There is a small test issue with
Steven Phillips created DRILL-4486:
--
Summary: Expression serializer incorrectly serializes escaped
characters
Key: DRILL-4486
URL: https://issues.apache.org/jira/browse/DRILL-4486
Project: Apache
Steven Phillips created DRILL-4455:
--
Summary: Depend on Apache Arrow for Vector and Memory
Key: DRILL-4455
URL: https://issues.apache.org/jira/browse/DRILL-4455
Project: Apache Drill
Issue
I don't understand why they wouldn't be allowed. They seem perfectly valid.
On Thu, Feb 11, 2016 at 9:42 AM, Abdel Hakim Deneche
wrote:
> I have the following table tpch100/lineitem that contains 97 parquet files:
>
> tpch100/lineitem/part-m-0.parquet
>
Steven Phillips created DRILL-4382:
--
Summary: Remove dependency on drill-logical from vector submodule
Key: DRILL-4382
URL: https://issues.apache.org/jira/browse/DRILL-4382
Project: Apache Drill
I just wanted to bring up an issue that I just now discovered, that has
caused me a fair amount of grief.
https://github.com/apache/drill/pull/300/commits
DRILL-4198 changes a user-facing API, and causes StoragePlugins that were
compiled against currently released versions of Drill to no longer
I merged a patch yesterday that I believe addresses that issue. Can you see
if you still hit it?
On Thu, Jan 21, 2016 at 8:39 AM, Jacques Nadeau wrote:
> Jinfeng, can you open a jira for the failing test if one isn't open?
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
I didn't see any tests running out of memory. Which tests are you seeing
this with?
On Wed, Dec 30, 2015 at 1:37 PM, Abdel Hakim Deneche
wrote:
> Steven,
>
> were you able to successfully run the regression tests on the transfer
> patch ? I just tried and saw several
Steven Phillips created DRILL-4215:
--
Summary: Transfer ownership of buffers when doing transfers
Key: DRILL-4215
URL: https://issues.apache.org/jira/browse/DRILL-4215
Project: Apache Drill
han test.
> > > > We are expecting sorted by kl2 (descending) so that non null values
> > come
> > > > up on top.
> > > > Results seem to have nulls on top.
> > > >
> > > > ~ Amit.
> > > >
> > > > On Mon, Dec
I just did a build on a Linux box, and didn't see this failure. My guess is
that it fails depending on which order the files are read.
On Mon, Dec 14, 2015 at 5:38 PM, Venki Korukanti
wrote:
> Is anyone else seeing below failure on latest master? I am running it on
>
+1 (binding)
Downloaded tarballs.
Verified checksums, verified build
On Thu, Dec 10, 2015 at 2:29 PM, Jinfeng Ni wrote:
> +1 (binding)
>
> Downloaded src tarball and build from source.
> Start drillbit in standalone mode.
> Run all the queries in yelp tutorial from
-1
The bug that Aman found is serious and should be fixed. Producing batches of
size greater than 64K could lead to some wrong results.
On Mon, Dec 7, 2015 at 2:26 PM, Abdel Hakim Deneche
wrote:
> Although I got a clean run on my linux VM, I am seeing the following error
i <
> venki.koruka...@gmail.com
> >> >
> >> wrote:
> >>
> >> > For DRILL-4109 and DRILL-4125, Vicky is not available today to
> verify. If
> >> > the changes are reviewed let's merge them today. Once the branch is cut
> >> >
uld be ok if we get the fix after
> lunch. Currently the branch is going through regression testing.
>
> On Thu, Dec 3, 2015 at 9:47 AM, Steven Phillips <ste...@dremio.com> wrote:
>
> > Okay, after looking at it more closely, it looks like it's an ordering
> > problem. We
Steven Phillips created DRILL-4159:
--
Summary: TestCsvHeader sometimes fails due to ordering issue
Key: DRILL-4159
URL: https://issues.apache.org/jira/browse/DRILL-4159
Project: Apache Drill
@gmail.com> wrote:
> > I run twice and hit the same error.
> >
> >
> > On Thu, Dec 3, 2015 at 12:10 AM, Steven Phillips <ste...@dremio.com>
> wrote:
> >> I just ran the tests on a linux machine, and did not see this failure.
> Do
> >> you see
Okay, I'm going to go ahead and merge DRILL-4145
On Wed, Dec 2, 2015 at 2:45 PM, Venki Korukanti
wrote:
> For DRILL-4145: ran the regression suite which includes customer and
> extended tests. No regressions found.
>
> On Wed, Dec 2, 2015 at 1:41 PM, Venki Korukanti
Sure, I'll see if I can merge 4108, and look into 4145.
On Tue, Dec 1, 2015 at 10:10 PM, Jacques Nadeau wrote:
> It seems like 4108 and 4145 should also be addressed. Steven, can you take
> a look at trying to get these merged/resolved? (4145 might be related to
> 4108 or
I think it is because we can't actually properly account for sliced
buffers. I don't remember for sure, but I think it might be because calling
buf.capacity() on a sliced buffer returns the capacity of the root buffer,
not the size of the slice. That may not be correct, but I think it was
I actually see it when running without tests as well.
On Fri, Nov 13, 2015 at 10:55 AM, Hsuan Yi Chu wrote:
> Not bad feature, which gives the visualization of unit test completion.
>
> On Fri, Nov 13, 2015 at 10:27 AM, Parth Chandra wrote:
>
> > Yes I
Steven Phillips created DRILL-4081:
--
Summary: Handle schema changes in ExternalSort
Key: DRILL-4081
URL: https://issues.apache.org/jira/browse/DRILL-4081
Project: Apache Drill
Issue Type
Does DRILL-4070 cause incorrect results? Or just prevent partition pruning?
On Thu, Nov 12, 2015 at 10:32 AM, Jason Altekruse
wrote:
> I just commented on the JIRA, we are behaving correctly for newly created
> parquet files. I did confirm the failure to prune on
+1 on merging this soon.
Going forward, I agree it makes sense to break the RPC module into a
stand-alone module that is not specific to drill. But whether it is better
for it to live in the Drill project or in the new Vector project, I am not
sure.
On Sun, Nov 8, 2015 at 6:42 PM, Jacques Nadeau
My view on storing it in some other format is that, yes, it will probably
reduce the size of the file, but if we gzip the json file, it should be
pretty compact. As for deserialization cost, other formats would be faster,
but not dramatically faster. Certainly not the order of magnitude faster
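To make the gzip point concrete, here is a rough sketch in Python. The metadata layout, field names, and file paths are made up for illustration and are not the actual file contents. The sketch only shows that repetitive JSON shrinks under gzip and round-trips unchanged.

```python
import gzip
import json

# Hypothetical metadata: one entry per file, with the same keys repeated.
metadata = {
    "files": [
        {
            "path": "/data/table/part-%05d.parquet" % i,
            "rowCount": 1000 + i,
            "columns": ["a", "b", "date"],
        }
        for i in range(1000)
    ]
}

# Serialize to JSON, then gzip the encoded bytes.
raw = json.dumps(metadata).encode("utf-8")
compressed = gzip.compress(raw)

# Reading it back is a decompress followed by a normal JSON parse.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))

print("raw bytes:", len(raw))
print("gzipped bytes:", len(compressed))
```

How much is saved depends on how repetitive the real file is, so the sizes printed here say nothing about actual metadata files.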
> > Consider the following 2 queries and their total elapsed times
> against
> > a
> > > table with 31 files:
> > > (A) SELECT count(*) FROM table WHERE `date` = '2015-07-01';
> > > elapsed time: 980 secs
> > >
> > >
I believe kill() will only stop the upstream fragments from sending
batches, but it does nothing about the batches that have already been sent.
When kill() is called on the RawBatchBuffer, this will release all of the
batches in the queue. But I believe it is still necessary to wait for all
I think we need to come up with a way to push partition pruning to
execution time. The other solutions may relieve the problem in some cases,
but won't solve the fundamental problem.
For example, even if we do figure out how to use multiple threads for
reading the metadata, that may be fine for a
I would think if we want to expose the timestamp field, we should add
another layer of nesting. In other words, every qualifier, which is
currently a single value, would actually be a map, which includes a value
field and timestamp field. Of course, we could also take it a step further
and expose
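As a rough illustration of the extra nesting level described above, here is a small Python sketch. The family name, qualifier names, values, and timestamps are hypothetical, and the field names `value` and `timestamp` are only one possible spelling.

```python
def add_timestamp_layer(families, timestamps):
    """Wrap each qualifier's single value in a map with value and timestamp fields."""
    nested = {}
    for family, qualifiers in families.items():
        nested[family] = {
            qualifier: {
                "value": value,
                "timestamp": timestamps[(family, qualifier)],
            }
            for qualifier, value in qualifiers.items()
        }
    return nested


# Current shape: every qualifier maps directly to a single value.
current = {"info": {"name": b"alice", "city": b"paris"}}

# Hypothetical per-cell timestamps, keyed by (family, qualifier).
cell_timestamps = {
    ("info", "name"): 1445000000000,
    ("info", "city"): 1445000005000,
}

# Proposed shape: every qualifier maps to {"value": ..., "timestamp": ...}.
proposed = add_timestamp_layer(current, cell_timestamps)
print(proposed["info"]["name"])
```

With this shape a query would reach the data through the nested value field rather than the qualifier itself, which is the cost of the extra layer.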
In the work I did for the Union types, (see PR
https://github.com/apache/drill/pull/207), I actually went down that exact
path. In that branch, if the Union type is enabled, any vectors created through
the ComplexWriter interface will not create any Repeated type vectors.
On Mon, Oct 19, 2015 at 2:29
I personally think the current usage of flatten is very unintuitive and
confusing, and I think the BigQuery usage is much better. If I were
designing this function from scratch, I would not allow using flatten in
the select clause and only allow it as a table function.
For example, take this
Steven Phillips created DRILL-3912:
--
Summary: Common subexpression elimination
Key: DRILL-3912
URL: https://issues.apache.org/jira/browse/DRILL-3912
Project: Apache Drill
Issue Type: Bug
Steven Phillips created DRILL-3909:
--
Summary: Decimal round functions corrupts input data
Key: DRILL-3909
URL: https://issues.apache.org/jira/browse/DRILL-3909
Project: Apache Drill
Issue
I think we should do a new candidate. We have two fixes that seem somewhat
important.
On Wed, Oct 7, 2015 at 10:37 AM, Abdel Hakim Deneche
wrote:
> the only way to include any new fixes into 1.2.0 is to sink the current
> release candidate and start another one.
>
> is
There is also the jdbc storage issue, which Andrew says he has a fix for.
It's just a packaging problem, but given that it's one of the main features
of this release, I think it's important to get in.
On Wed, Oct 7, 2015 at 10:39 AM, Abdel Hakim Deneche
wrote:
> sorry, I
> On Wed, Oct 7, 2015 at 10:41 AM, Steven Phillips <ste...@dremio.com>
> wrote:
>
> > There is also the jdbc storage issue, which Andrew says he has a fix for.
> > It's just a packaging problem, but given that it's one of the main
> features
> > of this release, I think
That bug only occurs when the selection is a path to a single file, and
that file is single-valued on the column in the where clause.
The more common use case of querying a directory which contains parquet
files that are each single-valued on a date column does not have this
problem.
Are you
In addition, your UDF needs to have the attribute "nulls =
NullHandling.INTERNAL"
On Tue, Oct 6, 2015 at 8:32 AM, Abdel Hakim Deneche
wrote:
> Hi Tug,
>
> Let's say your UDF returns an int, your @output field will be defined like
> this:
>
> @Output NullableIntHolder out;
[
https://issues.apache.org/jira/browse/DRILL-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3887.
Resolution: Fixed
> Parquet metadata cache not being u
I just pushed my fix for DRILL-3887.
On Fri, Oct 2, 2015 at 5:42 PM, Jason Altekruse
wrote:
> Hey Hakim,
>
> I have been having trouble with the unit tests on my machine today. The
> unit tests passed earlier, but I'm just trying to get a clean run with the
> patch
Steven Phillips created DRILL-3887:
--
Summary: Parquet metadata cache not being used
Key: DRILL-3887
URL: https://issues.apache.org/jira/browse/DRILL-3887
Project: Apache Drill
Issue Type
It looks like the comments for that function are not correct. If you look
at the javadoc for the toBinaryString() method which gets called, you will
get the complete story.
In short, it prints the bytes that are printable, and prints a hex
representation for bytes that are not printable. This is
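A minimal sketch of that behavior in Python, assuming "printable" means ASCII 0x20 through 0x7E and that every other byte is shown as \xNN. The function name is hypothetical and the real method's exact rules may differ.

```python
def render_bytes(data):
    """Return printable ASCII bytes as-is and all other bytes as \\xNN hex escapes."""
    parts = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            # Printable ASCII: emit the character itself.
            parts.append(chr(b))
        else:
            # Anything else: emit a two-digit uppercase hex escape.
            parts.append("\\x%02X" % b)
    return "".join(parts)


print(render_bytes(b"row1"))
print(render_bytes(b"ab\x00\xff"))
```

Note that this simplified version does not escape a literal backslash, so its output is not guaranteed to be reversible.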
I think it probably isn't needed anymore. I believe it is a holdover from
before spilling was implemented. It doesn't seem to serve any purpose now.
On Thu, Aug 27, 2015 at 9:17 AM, Abdel Hakim Deneche adene...@maprtech.com
wrote:
anyone ?
On Tue, Aug 25, 2015 at 2:56 PM, Abdel Hakim Deneche
It would be helpful if you could figure out what the file count is. But
here are some thoughts:
What is the value of the option:
store.partition.hash_distribute
If it is false, which it is by default, then every fragment will
potentially have data in every partition. In this case, that could
One possible exception to the access pattern occurs when vectors wrap other
vectors. Specifically, the offset vectors in Variable Length and Repeated
vectors. These vectors are accessed and mutated multiple times. If we are
going to implement strict enforcement, we need to consider that case.
On
:).
Regards,
-Stefan
On Wed, Aug 26, 2015 at 7:30 PM, Steven Phillips
s...@apache.org
wrote:
It would be helpful if you could figure out what the file count
is.
But
here are some thoughts:
What is the value of the option
The general convention we have adopted in the Drill community is to format
the commit message like this:
DRILL-jira number: Description of what was fixed
As long as you follow that pattern, I don't think there are really any
other expectations for making the pull request.
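For what it's worth, a subject line can be checked against that pattern with a few lines of Python. This is only a sketch: the regular expression is one reading of the pattern above, and the example subjects are made up.

```python
import re

# Assumed reading of the pattern: "DRILL-<digits>: <non-empty description>".
COMMIT_SUBJECT = re.compile(r"^DRILL-(\d+): (\S.*)$")


def check_commit_subject(subject):
    """Return (jira_number, description) if the subject matches, else None."""
    match = COMMIT_SUBJECT.match(subject)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# Hypothetical subjects, for illustration only.
print(check_commit_subject("DRILL-1234: Fix ordering issue in test"))
print(check_commit_subject("Fix ordering issue in test"))
```

A check like this could run in a local commit hook, though nothing in the thread says the project enforces it automatically.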
On Sat, Aug 22, 2015 at
.
- Rahul
--
Steven Phillips
Software Engineer
mapr.com
Steven Phillips created DRILL-3487:
--
Summary: MaterializedField equality doesn't check if nested fields
are equal
Key: DRILL-3487
URL: https://issues.apache.org/jira/browse/DRILL-3487
Project
(line 339)
https://reviews.apache.org/r/34374/#comment144025
It probably makes sense to release the batch there, but it's not necessary
because the RecordBatchLoader releases the buffers when it loads the new ones,
or when close() is called. So there is no memory leak here.
- Steven Phillips
Steven Phillips created DRILL-3477:
--
Summary: Using IntVector for null expressions causes problems with
implicit cast
Key: DRILL-3477
URL: https://issues.apache.org/jira/browse/DRILL-3477
Project
/
Testing
---
Thanks,
Steven Phillips
Diff: https://reviews.apache.org/r/36229/diff/
Testing
---
Thanks,
Steven Phillips
/apache/drill/exec/store/text/TestNewTextReader.java
76674f97f92ddc3e26e9a3789212c1b7708ec770
exec/java-exec/src/test/resources/textinput/input3.tsv PRE-CREATION
Diff: https://reviews.apache.org/r/36222/diff/
Testing
---
Thanks,
Steven Phillips
Diff: https://reviews.apache.org/r/36233/diff/
Testing
---
Thanks,
Steven Phillips
[
https://issues.apache.org/jira/browse/DRILL-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-1816.
Resolution: Cannot Reproduce
Scan Error with JSON on large no of records with Complex
[
https://issues.apache.org/jira/browse/DRILL-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-2760.
Resolution: Fixed
Fixed before 48d8a59
Quoted strings from CSV file appear in query
it is
probably
about
time for a 1.1 release. Shall we branch in the next day or two
and
put
1.1
to a vote?
Jacques
--
Steven Phillips
Software Engineer
mapr.com
/TestCTASPartitionFilter.java
48d7cebb26d2bf08baff39d6232e4829bd98d648
Diff: https://reviews.apache.org/r/36019/diff/
Testing
---
Thanks,
Steven Phillips
[
https://issues.apache.org/jira/browse/DRILL-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3410.
Resolution: Fixed
Fixed by c1998605dc2acdc5fa55792a279a473ff890a010
Partition Pruning
48d7cebb26d2bf08baff39d6232e4829bd98d648
Diff: https://reviews.apache.org/r/36019/diff/
Testing
---
Thanks,
Steven Phillips
://reviews.apache.org/r/35973/diff/
Testing
---
Thanks,
Steven Phillips
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35973/#review89636
---
On June 27, 2015, 6:30 p.m., Steven Phillips wrote
[
https://issues.apache.org/jira/browse/DRILL-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3376.
Resolution: Fixed
Fixed by 5f0e4cbd0f49600c41abf38056bcd29849c5cdf9
Reading individual
Steven Phillips created DRILL-3402:
--
Summary: Throw exception when attempting to partition for format
that don't support
Key: DRILL-3402
URL: https://issues.apache.org/jira/browse/DRILL-3402
Project
/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java
eff78724c6edfd4a7bffd8e78bf9cf1022e8ce75
Diff: https://reviews.apache.org/r/35941/diff/
Testing
---
Thanks,
Steven Phillips
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35960/#review89600
---
Ship it!
Ship It!
- Steven Phillips
On June 27, 2015, 1:12 a.m
Why BatchSchema.getColumn(int index) accept out-of-range values of
index?
Thanks,
Daniel
--
Daniel Barclay
MapR Technologies
--
Steven Phillips
Software Engineer
mapr.com
Steven Phillips created DRILL-3366:
--
Summary: Short circuit of OR expression causes incorrect
partitioning
Key: DRILL-3366
URL: https://issues.apache.org/jira/browse/DRILL-3366
Project: Apache Drill
---
On June 22, 2015, 10:22 p.m., Steven Phillips wrote:
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35739
---
Thanks,
Steven Phillips
/ParquetRecordWriter.java
621f05c4d50ecf83071a5df414be88e7471f0490
exec/java-exec/src/main/java/org/apache/drill/exec/store/text/DrillTextRecordWriter.java
31b1fbe9e03282161ee125cb7a4b2f53c8a8da63
Diff: https://reviews.apache.org/r/35739/diff/
Testing
---
Thanks,
Steven Phillips
prevent
JIRAs needing long lists of patches with names like
DRILL-3000-part1-version3.patch
--
Steven Phillips
Software Engineer
mapr.com
://www.mapr.com/training?utm_source=Emailutm_medium=Signatureutm_campaign=Free%20available
--
Steven Phillips
Software Engineer
mapr.com
/parser/SqlCreateTable.java
https://reviews.apache.org/r/35026/#comment138544
Should we maybe use PARTITIONED BY instead, to match Hive's syntax?
- Steven Phillips
On June 3, 2015, 9:14 p.m., Jinfeng Ni wrote
?
--
Steven Phillips
Software Engineer
mapr.com
[
https://issues.apache.org/jira/browse/DRILL-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3100.
Resolution: Fixed
Resolved in d8b1975
TestImpersonationDisabledWithMiniDFS fails
[
https://issues.apache.org/jira/browse/DRILL-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3098.
Resolution: Fixed
Resolved in 984ee01
Set Unix style line.separator for tests
[
https://issues.apache.org/jira/browse/DRILL-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3093.
Resolution: Fixed
fixed in 7f575df
Leaking RawBatchBuffer
[
https://issues.apache.org/jira/browse/DRILL-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3051.
Resolution: Pending Closed
Fixed in 83d8ebe
Integer overflow in TimedRunnable
[
https://issues.apache.org/jira/browse/DRILL-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3050.
Resolution: Pending Closed
Fixed in b3d097b
Increase query context max memory
[
https://issues.apache.org/jira/browse/DRILL-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3049.
Resolution: Pending Closed
01a36f1
Increase sort spooling threshold
[
https://issues.apache.org/jira/browse/DRILL-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3051.
Resolution: Fixed
Integer overflow in TimedRunnable
[
https://issues.apache.org/jira/browse/DRILL-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steven Phillips resolved DRILL-3050.
Resolution: Fixed
Increase query context max memory
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34234/#review83842
---
Ship it!
Ship It!
- Steven Phillips
On May 14, 2015, 9:06 p.m
Diff: https://reviews.apache.org/r/34239/diff/
Testing
---
Thanks,
Steven Phillips
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34184/#review83877
---
Ship it!
Ship It!
- Steven Phillips
On May 14, 2015, 11:14 p.m
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34184/#review83888
---
Ship it!
Ship It!
- Steven Phillips
On May 14, 2015, 11:14 p.m
/apache/drill/exec/work/batch/TestUnlimitedBatchBuffer.java
b8336e9779d45251081128ff1450adc4a2f38576
Diff: https://reviews.apache.org/r/34037/diff/
Testing
---
Thanks,
Steven Phillips
Steven Phillips created DRILL-3049:
--
Summary: Increase sort spooling threshold
Key: DRILL-3049
URL: https://issues.apache.org/jira/browse/DRILL-3049
Project: Apache Drill
Issue Type: Bug
Steven Phillips created DRILL-3048:
--
Summary: Disable assertions by default
Key: DRILL-3048
URL: https://issues.apache.org/jira/browse/DRILL-3048
Project: Apache Drill
Issue Type: Bug