Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1834#discussion_r164779536
--- Diff:
core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java
---
@@ -299,11 +299,11 @@ public boolean delete() {
}
@Override public DataInputStream getDataInputStream(String path,
FileFactory.FileType fileType,
- int bufferSize, String compressor) throws IOException {
+ int bufferSize, Configuration configuration, String compressor)
throws IOException {
--- End diff --
Handling the configuration only in getDataInputStream is not sufficient.
1) All file operations should use the configuration passed through the
constructor.
2) All connecting flows from Spark to CarbonData file operations should
pass the Hadoop configuration. Ex: InputFormats and OutputFormats should
comply with the configuration being passed.
3) RDDs involving file operations, like DataloadRdd, MergeRdd, ScanRdd,
should ship the conf to the executors and pass it on to the file
operations. Ex: refer to Spark's NewHadoopRDD, which broadcasts the conf.
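To illustrate point 3: Hadoop's `Configuration` is not `java.io.Serializable`, which is why NewHadoopRDD wraps it in a serializable holder before broadcasting it to executors. Below is a minimal, self-contained sketch of that pattern. All class names here (`FakeHadoopConf`, `SerializableConf`, `Main`) are hypothetical stand-ins, not CarbonData or Spark APIs; the stand-in conf mimics Hadoop's `write(DataOutput)`/`readFields(DataInput)` style.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's Configuration, which is NOT Serializable.
class FakeHadoopConf {
  private final Map<String, String> props = new HashMap<>();
  public void set(String k, String v) { props.put(k, v); }
  public String get(String k) { return props.get(k); }
  // Mimics Hadoop's Writable-style serialization.
  public void write(DataOutput out) throws IOException {
    out.writeInt(props.size());
    for (Map.Entry<String, String> e : props.entrySet()) {
      out.writeUTF(e.getKey());
      out.writeUTF(e.getValue());
    }
  }
  public void readFields(DataInput in) throws IOException {
    int n = in.readInt();
    for (int i = 0; i < n; i++) {
      props.put(in.readUTF(), in.readUTF());
    }
  }
}

// Serializable wrapper: the conf field is transient, and the custom
// writeObject/readObject hooks delegate to the conf's own serialization.
// This is the pattern Spark uses to broadcast a conf to executors.
class SerializableConf implements Serializable {
  private transient FakeHadoopConf conf;
  SerializableConf(FakeHadoopConf conf) { this.conf = conf; }
  FakeHadoopConf get() { return conf; }
  private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    conf.write(out);
  }
  private void readObject(ObjectInputStream in)
      throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    conf = new FakeHadoopConf();
    conf.readFields(in);
  }
}

public class Main {
  public static void main(String[] args) throws Exception {
    // Driver side: build the conf and serialize the wrapper
    // (Spark would broadcast these bytes to the executors).
    FakeHadoopConf driverConf = new FakeHadoopConf();
    driverConf.set("fs.defaultFS", "hdfs://nn:8020");
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutputStream oos = new ObjectOutputStream(bos);
    oos.writeObject(new SerializableConf(driverConf));
    oos.flush();

    // Executor side: deserialize and hand the conf to file operations.
    SerializableConf onExecutor = (SerializableConf) new ObjectInputStream(
        new ByteArrayInputStream(bos.toByteArray())).readObject();
    System.out.println(onExecutor.get().get("fs.defaultFS"));
  }
}
```

An RDD's compute method on the executor would then unwrap the conf and pass it into the file-operation constructors, per point 1 above.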
---