GitHub user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2706#discussion_r218415262
--- Diff:
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
---
@@ -239,6 +239,7 @@ public void addDataToStore(CarbonRow row) throws CarbonDataWriterException {
* @return false if any varchar column page cannot add one more value(2MB)
*/
private boolean isVarcharColumnFull(CarbonRow row) {
+ //TODO: test and remove this as now UnsafeSortDataRows can exceed 2MB
--- End diff --
I am not sure how we arrived at the 2MB limit. There is no guarantee
that the data is always sorted through UnsafeSortDataRows. What about the
no-sort case? And if a user wants to add a 100MB varchar value, how do we
support it? Also, this is not limited to varchar; we should consider
complex and string columns here as well.
@ajantha-bhat Please remove that TODO. But we need to refactor the code to
ensure that the page size stays within the snappy max compressed length for
complex and string data types as well.
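
To illustrate the kind of check meant here, a minimal sketch follows. It is
not CarbonData code: the class PageSizeGuard and all of its method names are
hypothetical. It assumes the limit comes from the compressed page having to
fit in an int-indexed Java byte array, and it uses snappy's worst-case bound
of 32 + n + n/6 output bytes for n input bytes.

```java
// Hypothetical helper for illustration only; not part of CarbonData.
final class PageSizeGuard {

    private PageSizeGuard() { }

    // Snappy's worst-case compressed size for n uncompressed bytes: 32 + n + n/6.
    static long maxCompressedLength(long uncompressedBytes) {
        return 32L + uncompressedBytes + uncompressedBytes / 6L;
    }

    // Largest uncompressed page size whose worst-case compressed size still
    // fits in an int-indexed byte array (assumed to be the real constraint).
    static long maxUncompressedPageBytes() {
        long lo = 0L;
        long hi = Integer.MAX_VALUE;
        // Binary search works because maxCompressedLength never decreases.
        while (lo < hi) {
            long mid = lo + (hi - lo + 1) / 2;
            if (maxCompressedLength(mid) <= Integer.MAX_VALUE) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    // True if one more value of valueBytes can be added to a page that
    // currently holds currentPageBytes without exceeding limitBytes.
    // The same check applies to varchar, string and complex columns.
    static boolean canAppend(long currentPageBytes, long valueBytes, long limitBytes) {
        return currentPageBytes + valueBytes <= limitBytes;
    }
}
```

With a check like this, the caller would pass
PageSizeGuard.maxUncompressedPageBytes() as the limit instead of a fixed 2MB,
and would cut the page before adding a value for which canAppend returns false.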
---