[jira] [Commented] (DRILL-7512) Parquet Reading or Writing does not work with ADLS Gen 2

RONIERI MARQUES RAMALHO (Jira) Fri, 21 Feb 2020 16:23:22 -0800


    [ 
https://issues.apache.org/jira/browse/DRILL-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17042285#comment-17042285
 ]


RONIERI MARQUES RAMALHO commented on DRILL-7512:
------------------------------------------------

I don't know if this helps, but I have a very similar problem using only Azure 
apps (ADF and AzCopy) to read from Azure Data Lake Storage Gen 2, so I don't 
think Apache Drill is to blame.

I opened a case with Microsoft and am waiting for a response. So far, we 
believe something is corrupted in the data files.

The workaround was to read those files using Databricks PySpark DataFrames 
and then rewrite the data to the Data Lake.
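
The workaround above can be sketched roughly as follows. This is only a 
minimal sketch, not the exact code that was run: the function name 
`rewrite_parquet` and the paths in the usage comment are hypothetical, and it 
assumes a Databricks notebook where a `spark` session already exists and can 
reach the storage account.

```python
def rewrite_parquet(spark, source_path, target_path):
    """Read Parquet files with Spark and write them back out as new files.

    `spark` is an existing SparkSession (Databricks notebooks provide one).
    The target must differ from the source so that files are not
    overwritten while they are still being read.
    """
    if source_path.rstrip("/") == target_path.rstrip("/"):
        raise ValueError("source and target must differ")
    df = spark.read.parquet(source_path)
    df.write.mode("overwrite").parquet(target_path)
    return df


# Hypothetical usage in a Databricks notebook (placeholder account and paths):
# rewrite_parquet(
#     spark,
#     "abfss://container@account.dfs.core.windows.net/data/region.parquet",
#     "abfss://container@account.dfs.core.windows.net/rewritten/region.parquet",
# )
```

Writing to a separate location first also leaves the original files in place 
for the investigation.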

But as this "corruption" has happened twice in four months, I'll wait for 
the investigation to find the cause, in the hope of avoiding it.

 

Regards

> Parquet Reading or Writing does not work with ADLS Gen 2
> --------------------------------------------------------
>
>                 Key: DRILL-7512
>                 URL: https://issues.apache.org/jira/browse/DRILL-7512
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.16.0, 1.17.0
>            Reporter: Greg Shomette
>            Priority: Minor
>
> I can query delimited files in ADLS Gen 2 using the wasb blob storage plugin. 
> I can show files and see Parquet files, but I cannot read or write them.
> I can use a DFS plugin to read and write Parquet locally, but not with Gen 2. 
> ADLS Gen 1 works fine for reading and writing.
> I have tried two versions of Drill, with the recommended jar files as well 
> as older versions, and still no luck.
> Does Drill support Data Lake Gen 2 with Parquet files?
> This query produces the following error:
> CREATE TABLE az.tmp.sampleparquet AS (SELECT * FROM 
> az.`/Conformed/DimGeo.psv`)
> (java.lang.RuntimeException) java.lang.NoSuchMethodError: 
> com.microsoft.azure.storage.blob.CloudBlob.startCopyFromBlob(Ljava/net/URI;Lcom/microsoft/azure/storage/AccessCondition;Lcom/microsoft/azure/storage/AccessCondition;Lcom/microsoft/azure/storage/blob/BlobRequestOptions;Lcom/microsoft/azure/storage/OperationContext;)Ljava/lang/String;
>     org.apache.drill.common.DeferredException.addThrowable():101
>     org.apache.drill.exec.work.fragment.FragmentExecutor.fail():475
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():317
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>  
> This query produces the following error:
> select * from az.`region.parquet`
> SYSTEM ERROR: StorageException: The requested operation is not 
> allowed in the current state of the entity.
> Please, refer to logs for more information.
>   (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception 
> during fragment initialization: Error while applying rule DrillScanRule, args 
> [rel#44:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[az, region.parquet])]
>     org.apache.drill.exec.work.foreman.Foreman.run():302
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1149
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():624
>     java.lang.Thread.run():748



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
