[GitHub] carbondata pull request #2895: [HOTFIX] Fix NPE in spark, when same vector r...

2018-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2895


---


2018-11-05 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2895#discussion_r231001948
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
 ---
@@ -61,10 +61,7 @@ public void fillVector(int[] invertedIndex, int[] 
invertedIndexReverse, byte[] d
 int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
 int rowsNum = data.length / columnValueSize;
 CarbonColumnVector vector = vectorInfo.vector;
-if (!dictionary.isDictionaryUsed()) {
-  vector.setDictionary(dictionary);
-  dictionary.setDictionaryUsed();
-}
+vector.setDictionary(dictionary);
--- End diff --

done


---


2018-11-05 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2895#discussion_r230999455
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
 ---
@@ -61,10 +61,7 @@ public void fillVector(int[] invertedIndex, int[] 
invertedIndexReverse, byte[] d
 int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
 int rowsNum = data.length / columnValueSize;
 CarbonColumnVector vector = vectorInfo.vector;
-if (!dictionary.isDictionaryUsed()) {
-  vector.setDictionary(dictionary);
-  dictionary.setDictionaryUsed();
-}
+vector.setDictionary(dictionary);
--- End diff --

Both methods are called in the carbon flow for vector filling: one in the
direct fill case and the other in the old vector fill flow. Please cross-check
once.


---


2018-11-05 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2895#discussion_r230997947
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
 ---
@@ -61,10 +61,7 @@ public void fillVector(int[] invertedIndex, int[] 
invertedIndexReverse, byte[] d
 int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
 int rowsNum = data.length / columnValueSize;
 CarbonColumnVector vector = vectorInfo.vector;
-if (!dictionary.isDictionaryUsed()) {
-  vector.setDictionary(dictionary);
-  dictionary.setDictionaryUsed();
-}
+vector.setDictionary(dictionary);
--- End diff --

I checked this while coding: `fillRow` is our own method, so there is no issue
there. Only this vector is a Spark vector, and the dictionary needs to be
cleared for it.


---


2018-11-05 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2895#discussion_r230814030
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/LocalDictDimensionDataChunkStore.java
 ---
@@ -61,10 +61,7 @@ public void fillVector(int[] invertedIndex, int[] 
invertedIndexReverse, byte[] d
 int columnValueSize = dimensionDataChunkStore.getColumnValueSize();
 int rowsNum = data.length / columnValueSize;
 CarbonColumnVector vector = vectorInfo.vector;
-if (!dictionary.isDictionaryUsed()) {
-  vector.setDictionary(dictionary);
-  dictionary.setDictionaryUsed();
-}
+vector.setDictionary(dictionary);
--- End diff --

Is the same handling also required in the `fillRow` method of the same class?
If it is, the `isDictionaryUsed` and `setDictionaryUsed` APIs will no longer be
needed and can be removed from the interface.
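
One plausible reading of why the guard is dropped in this diff, sketched as a
toy Java model: if the vector's dictionary is cleared after every batch, a
"set only once" guard leaves later batches without a dictionary. The classes
`ToyDictionary` and `ToyVector` and the helper methods are hypothetical
stand-ins written for this illustration. Only the method names
`isDictionaryUsed`, `setDictionaryUsed` and `setDictionary` come from the
diff, and their real implementations may differ.

```java
/**
 * Toy model of a "set once" guard combined with a vector whose dictionary is
 * cleared per batch. Not the CarbonData implementation; an assumption-laden
 * sketch only.
 */
class SetOnceGuardDemo {

  static final class ToyDictionary {
    private boolean used;
    boolean isDictionaryUsed() { return used; }
    void setDictionaryUsed() { used = true; }
  }

  static final class ToyVector {
    private ToyDictionary dictionary;
    void setDictionary(ToyDictionary dictionary) { this.dictionary = dictionary; }
    boolean hasDictionary() { return dictionary != null; }
    void reset() { dictionary = null; } // assumed: reset clears the dictionary
  }

  /** Mirrors the removed lines of the diff: set the dictionary only once. */
  static void fillGuarded(ToyVector vector, ToyDictionary dictionary) {
    if (!dictionary.isDictionaryUsed()) {
      vector.setDictionary(dictionary);
      dictionary.setDictionaryUsed();
    }
  }

  /** Mirrors the added line of the diff: set the dictionary on every fill. */
  static void fillUnconditional(ToyVector vector, ToyDictionary dictionary) {
    vector.setDictionary(dictionary);
  }

  /** Fills two batches from the same dictionary, resetting the vector between them. */
  static boolean secondBatchHasDictionary(boolean guarded) {
    ToyVector vector = new ToyVector();
    ToyDictionary dictionary = new ToyDictionary();
    for (int batch = 0; batch < 2; batch++) {
      if (batch > 0) {
        vector.reset();
      }
      if (guarded) {
        fillGuarded(vector, dictionary);
      } else {
        fillUnconditional(vector, dictionary);
      }
    }
    return vector.hasDictionary();
  }

  public static void main(String[] args) {
    System.out.println(secondBatchHasDictionary(true));  // prints "false"
    System.out.println(secondBatchHasDictionary(false)); // prints "true"
  }
}
```

In this model the guard and the per-batch reset cannot coexist, which would
explain why the two guard APIs become candidates for removal.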


---


2018-11-05 Thread ajantha-bhat
GitHub user ajantha-bhat opened a pull request:

https://github.com/apache/carbondata/pull/2895

[HOTFIX] Fix NPE in spark, when same vector reads files with local 
dictionary and without local dictionary

Problem: an NPE occurs in Spark when the same vector reads files with a local
dictionary and files without one.

Cause: when two carbondata files are present, one with a local dictionary and
one without, the same vector may be used to read both (this can happen if a
task is launched for a group of files). If the file with the local dictionary
is read first, the dictionary is set on that vector, but it was never reset
before the next file was read.

Solution: reset the dictionary once a batch is processed, and set it only when
processing a local dictionary batch.
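
The cause and solution above can be illustrated with a minimal, self-contained
Java sketch. `ReusableVector`, `fillEncoded`, `fillPlain`, `resetData` and
`resetAll` are hypothetical names invented for this illustration. They are not
the CarbonData or Spark APIs, and the real vector classes are more involved.
The sketch only assumes that a vector with a dictionary set decodes reads
through that dictionary.

```java
/**
 * Toy model of a column vector that one task reuses across several files.
 * All names here are hypothetical and exist only for this illustration.
 */
class StaleDictionaryDemo {

  static final class ReusableVector {
    private String[] dictionary;  // non-null means reads are decoded through it
    private int[] dictIds;        // filled only for dictionary-encoded batches
    private String[] plainValues; // filled only for plain batches

    void fillEncoded(String[] dict, int[] ids) {
      this.dictionary = dict;
      this.dictIds = ids;
    }

    void fillPlain(String[] values) {
      this.plainValues = values;
    }

    /** Models the behaviour before the fix: row data is cleared, the dictionary is kept. */
    void resetData() {
      dictIds = null;
      plainValues = null;
    }

    /** Models the behaviour after the fix: the dictionary is cleared with the batch. */
    void resetAll() {
      resetData();
      dictionary = null;
    }

    String get(int row) {
      // A stale dictionary sends this read down the dictionary path even
      // though the current batch never filled dictIds.
      return dictionary != null ? dictionary[dictIds[row]] : plainValues[row];
    }
  }

  /** Reads a local dictionary file, then a plain file, with the same vector. */
  static String readBothFiles(boolean clearDictionaryOnReset) {
    ReusableVector vector = new ReusableVector();
    vector.fillEncoded(new String[] {"a", "b"}, new int[] {1, 0});
    String first = vector.get(0);
    if (clearDictionaryOnReset) {
      vector.resetAll();
    } else {
      vector.resetData();
    }
    vector.fillPlain(new String[] {"x"});
    try {
      return first + "," + vector.get(0);
    } catch (NullPointerException e) {
      return first + ",NPE";
    }
  }

  public static void main(String[] args) {
    System.out.println(readBothFiles(false)); // prints "b,NPE"
    System.out.println(readBothFiles(true));  // prints "b,x"
  }
}
```

With the dictionary left in place, the second file's read fails. Clearing it
along with the batch lets the plain file be read normally.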


Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed? NA
 
 - [ ] Any backward compatibility impacted? NA
 
 - [ ] Document update required? NA

 - [ ] Testing done? Yes, cluster testing done.
 
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.  NA



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ajantha-bhat/carbondata master_new

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2895.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2895


commit 99c7621336e3cf180bfa0c3a326a2f1fafe51631
Author: ajantha-bhat 
Date:   2018-11-05T10:00:27Z

Fix vector reading with local dictionary and without local dictionary




---