Yes, seems like it is possible to create files with different block sizes.
We could potentially pass the configured store.parquet.block-size to the create
call.
I will try it out and see. Will let you know.
Thanks,
Padma
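A minimal sketch of the idea discussed above, assuming the Hadoop `FileSystem.create` overload linked later in this thread (the one taking an explicit block size). The option name `store.parquet.block-size` comes from Drill's configuration; the 512 MB value, the output path, and the class name are illustrative assumptions, not Drill's actual writer code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ParquetBlockSizeSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Suppose store.parquet.block-size is configured to 512 MB in Drill;
    // passing the same value as the per-file HDFS block size lets each
    // parquet file occupy a single HDFS block, even when the cluster
    // default block size is smaller (e.g. 128 MB).
    long parquetBlockSize = 512L * 1024 * 1024;

    // FileSystem.create(Path, overwrite, bufferSize, replication, blockSize)
    FSDataOutputStream out = fs.create(
        new Path("/tmp/ctas/0_0_0.parquet"),     // hypothetical output file
        true,                                    // overwrite if present
        conf.getInt("io.file.buffer.size", 4096),
        fs.getDefaultReplication(),              // keep cluster replication
        parquetBlockSize);                       // per-file HDFS block size
    out.close();
  }
}
```

Note the block size here is a per-file property recorded at create time, so it does not require changing the cluster-wide `dfs.blocksize`.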
> On Mar 22, 2017, at 4:16 PM, François Méthot
Github user jinfengni commented on a diff in the pull request:
https://github.com/apache/drill/pull/793#discussion_r107561519
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdRowCount.java
---
@@ -14,35 +14,71 @@
* WITHOUT WARRANTIES OR
[
https://issues.apache.org/jira/browse/DRILL-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rahul Challapalli resolved DRILL-5001.
--
Resolution: Not A Bug
Ok...this is not a bug. The underlying parquet data actually
Here are 2 links I could find:
http://archive.cloudera.com/cdh4/cdh/4/hadoop/api/org/apache/hadoop/fs/FileSystem.html#create(org.apache.hadoop.fs.Path,%20boolean,%20int,%20short,%20long)
GitHub user Serhii-Harnyk opened a pull request:
https://github.com/apache/drill/pull/793
DRILL-4678: Tune metadata by generating a dispatcher at runtime
Changes for rebasing to Calcite 1.4.0-drill-r20
You can merge this pull request into a Git repository by running:
$ git
I think we create one file for each parquet block.
If underlying HDFS block size is 128 MB and parquet block size is > 128MB,
it will create more blocks on HDFS.
Can you let me know which HDFS API would allow you to
do otherwise?
Thanks,
Padma
> On Mar 22, 2017, at 11:54 AM,
Hi,
Is there a way to force Drill to store CTAS generated parquet file as a
single block when using HDFS? The Java HDFS API allows this: files can
be created with the Parquet block size.
We are using Drill on HDFS configured with a block size of 128MB. Changing
this size is not an option at
Paul Rogers created DRILL-5376:
--
Summary: Rationalize Drill's row structure for simpler code,
better performance
Key: DRILL-5376
URL: https://issues.apache.org/jira/browse/DRILL-5376
Project: Apache
GitHub user vdiravka opened a pull request:
https://github.com/apache/drill/pull/792
DRILL-4971: query encounters system error: Statement "break AndOP3" i…
…s not enclosed by a breakable statement with label "AndOP3"
- New evaluated blocks for boolean operators should
Arina Ielchiieva created DRILL-5375:
---
Summary: Nested loop join: return correct result for left join
Key: DRILL-5375
URL: https://issues.apache.org/jira/browse/DRILL-5375
Project: Apache Drill
I'm trying to use Drill with a proprietary datasource that is very fast in
applying data joins (i.e. SQL joins) and query filters (i.e. SQL where
conditions).
To connect to that datasource, I first have to write a storage plugin, but
I'm not sure if my main goal is applicable.
My main goal is