From Preetham Poluparthi <[email protected]>:

Preetham Poluparthi has uploaded this change for review. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20788?usp=email )


Change subject: [ASTERIXDB-3392][EXT] Fix false warnings while querying parquet
......................................................................

[ASTERIXDB-3392][EXT] Fix false warnings while querying parquet

- user model changes: no
- storage format changes: no
- interface changes: no

Details:
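Queries over Parquet files reported false warning counts. This patch fixes
that by dropping the ctx.getWarningCollector().shouldWarn() check in
HDFSDataSourceFactory, so the Parquet-specific JobConf handling applies
whenever the input format is Parquet. It also corrects the extension used for
written Parquet files: they were previously named .parquet.zstd and are now
named .zstd.parquet.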

Ext-ref: MB-70108

Change-Id: Id2dc25a30ea1bf7012f945803befc2751f33b86a
---
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/HDFSDataSourceFactory.java
M asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/provider/ExternalWriterProvider.java
2 files changed, 7 insertions(+), 3 deletions(-)



  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/88/20788/1

diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/HDFSDataSourceFactory.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/HDFSDataSourceFactory.java
index b820147..82653c2 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/HDFSDataSourceFactory.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/input/HDFSDataSourceFactory.java
@@ -321,9 +321,8 @@
             }
             restoreConfig(ctx);
             JobConf readerConf = conf;
-            if (ctx.getWarningCollector().shouldWarn()
-                    && configuration.get(ExternalDataConstants.KEY_INPUT_FORMAT.trim())
-                            .equals(ExternalDataConstants.INPUT_FORMAT_PARQUET)) {
+            if (configuration.get(ExternalDataConstants.KEY_INPUT_FORMAT.trim())
+                    .equals(ExternalDataConstants.INPUT_FORMAT_PARQUET)) {
                 /*
                  * JobConf is used to pass warnings from the ParquetReadSupport to ParquetReader. As multiple
                  * partitions can issue different warnings, we might have a race condition on JobConf. Thus, we
diff --git a/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/provider/ExternalWriterProvider.java b/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/provider/ExternalWriterProvider.java
index bdfffa0..763a7a1 100644
--- a/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/provider/ExternalWriterProvider.java
+++ b/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/provider/ExternalWriterProvider.java
@@ -99,6 +99,11 @@
         Map<String, String> configuration = sink.getConfiguration();
         String format = getFormat(configuration);
         String compression = getCompression(configuration);
+        if (format.equalsIgnoreCase(ExternalDataConstants.FORMAT_PARQUET)) {
+            // Parquet file extension format is like .snappy.parquet
+            return (compression.isEmpty() ? "" : compression.toLowerCase() + ".")
+                    + ExternalDataConstants.FORMAT_PARQUET;
+        }
         return format + (compression.isEmpty() ? "" : "." + compression);
     }


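As a rough illustration of the extension logic in the ExternalWriterProvider
hunk above, the following stand-alone Java sketch reproduces it outside
AsterixDB. The class name FileExtensionSketch, the method name fileExtension,
and the literal value "parquet" for FORMAT_PARQUET are assumptions made for
illustration only; the real constant lives in ExternalDataConstants and its
value is not shown in this change. The sketch also uses Locale.ROOT when
lowercasing, which the patch itself does not.

```java
import java.util.Locale;

// Hypothetical stand-alone sketch; this is not the ExternalWriterProvider code.
final class FileExtensionSketch {
    // Assumed stand-in for ExternalDataConstants.FORMAT_PARQUET.
    static final String FORMAT_PARQUET = "parquet";

    // Builds the file extension (without a leading dot) from format and compression.
    static String fileExtension(String format, String compression) {
        if (format.equalsIgnoreCase(FORMAT_PARQUET)) {
            // Parquet puts the codec before the format, e.g. "zstd.parquet".
            String prefix = compression.isEmpty() ? "" : compression.toLowerCase(Locale.ROOT) + ".";
            return prefix + FORMAT_PARQUET;
        }
        // Other formats keep the format first, e.g. "json.gzip".
        return format + (compression.isEmpty() ? "" : "." + compression);
    }

    public static void main(String[] args) {
        System.out.println(fileExtension("parquet", "ZSTD")); // zstd.parquet
        System.out.println(fileExtension("parquet", ""));     // parquet
        System.out.println(fileExtension("json", "gzip"));    // json.gzip
    }
}
```

Only the Parquet branch lowercases the compression name; the non-Parquet
branch passes it through unchanged, as in the patch.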
--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20788?usp=email
To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings?usp=email

Gerrit-MessageType: newchange
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Id2dc25a30ea1bf7012f945803befc2751f33b86a
Gerrit-Change-Number: 20788
Gerrit-PatchSet: 1
Gerrit-Owner: Preetham Poluparthi <[email protected]>
