[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2139


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-11 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2139#discussion_r180789005
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/PrestoFilterUtil.java
 ---
@@ -69,6 +72,13 @@
 
   private static Map filterMap = new HashMap<>();
 
+  private final static String HIVE_DEFAULT_DYNAMIC_PARTITION = 
"__HIVE_DEFAULT_PARTITION__";
--- End diff --

ok


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-11 Thread geetikagupta16
Github user geetikagupta16 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2139#discussion_r180672102
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/PrestoFilterUtil.java
 ---
@@ -69,6 +72,13 @@
 
   private static Map filterMap = new HashMap<>();
 
+  private final static String HIVE_DEFAULT_DYNAMIC_PARTITION = 
"__HIVE_DEFAULT_PARTITION__";
+  private final static String PARTITION_VALUE_WILDCARD = "";
--- End diff --

In hive PARTITION_VALUE_WILDCARD is used to select all the partitions from 
the table when we don't have any filter conditions on the partition column in 
our query, but in our code we are already quering all the partitions for the 
above case. So we can remove this part.


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-11 Thread geetikagupta16
Github user geetikagupta16 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2139#discussion_r180649360
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/PrestoFilterUtil.java
 ---
@@ -69,6 +72,13 @@
 
   private static Map filterMap = new HashMap<>();
 
+  private final static String HIVE_DEFAULT_DYNAMIC_PARTITION = 
"__HIVE_DEFAULT_PARTITION__";
--- End diff --

HIVE_DEFAULT_DYNAMIC_PARTITION is created when the partition value for that 
column is null so the data goes to __HIVE_DEFAULT_PARTITION__ partition. This 
check is added so that we can query the data from __HIVE_DEFAULT_PARTITION__ as 
well.


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-10 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2139#discussion_r180554217
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/PrestoFilterUtil.java
 ---
@@ -69,6 +72,13 @@
 
   private static Map filterMap = new HashMap<>();
 
+  private final static String HIVE_DEFAULT_DYNAMIC_PARTITION = 
"__HIVE_DEFAULT_PARTITION__";
+  private final static String PARTITION_VALUE_WILDCARD = "";
--- End diff --

Why need to add "wildcard" for partition filter?


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-10 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2139#discussion_r180553823
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/PrestoFilterUtil.java
 ---
@@ -69,6 +72,13 @@
 
   private static Map filterMap = new HashMap<>();
 
+  private final static String HIVE_DEFAULT_DYNAMIC_PARTITION = 
"__HIVE_DEFAULT_PARTITION__";
--- End diff --

can you explain, why need to add "HIVE_DEFAULT_DYNAMIC_PARTITION" ? what 
purpose?
if (value == null) {
filter.add(carbonDataColumnHandle.getColumnName() + "=" + 
HIVE_DEFAULT_DYNAMIC_PARTITION);


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-07 Thread bhavya411
Github user bhavya411 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2139#discussion_r179913482
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
 ---
@@ -415,14 +440,56 @@ public TBase create() {
 return result;
   }
 
+  /** Returns list of partition specs to query based on the domain 
constraints
+   * @param constraints
+   * @param carbonTable
+   * @throws IOException
+   */
+  private List 
findRequiredPartitions(TupleDomain constraints, CarbonTable 
carbonTable,
+  LoadMetadataDetails[]loadMetadataDetails)  {
+Set partitionSpecs = new HashSet<>();
+List prunePartitions = new ArrayList();
+
+for (LoadMetadataDetails loadMetadataDetail : loadMetadataDetails) {
+  SegmentFileStore segmentFileStore = null;
+  try {
+segmentFileStore =
+new SegmentFileStore(carbonTable.getTablePath(), 
loadMetadataDetail.getSegmentFile());
+partitionSpecs.addAll(segmentFileStore.getPartitionSpecs());
+
+  } catch (IOException e) {
+e.printStackTrace();
+  }
+}
+
+List partitionValuesFromExpression =
+PrestoFilterUtil.getPartitionFilters(carbonTable, constraints);
+
+List partitionSpecNamesList = 
partitionSpecs.stream().map(
+PartitionSpec::getPartitions).collect(Collectors.toList());
+
+List partitionSpecsList = new ArrayList(partitionSpecs);
+
+for (int i = 0; i < partitionSpecNamesList.size(); i++) {
--- End diff --

I am not sure about this logic , you have a set and you create a 
specsNameList and then you create a list of partitionSpecs and then you check 
that if the list has all fiteredPartition you add it to prunedPartitiion, why 
not do it in the 1st map itself instead of complicating code


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-07 Thread bhavya411
Github user bhavya411 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2139#discussion_r179913431
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
 ---
@@ -415,14 +440,56 @@ public TBase create() {
 return result;
   }
 
+  /** Returns list of partition specs to query based on the domain 
constraints
+   * @param constraints
+   * @param carbonTable
+   * @throws IOException
+   */
+  private List 
findRequiredPartitions(TupleDomain constraints, CarbonTable 
carbonTable,
+  LoadMetadataDetails[]loadMetadataDetails)  {
+Set partitionSpecs = new HashSet<>();
+List prunePartitions = new ArrayList();
+
+for (LoadMetadataDetails loadMetadataDetail : loadMetadataDetails) {
+  SegmentFileStore segmentFileStore = null;
+  try {
+segmentFileStore =
+new SegmentFileStore(carbonTable.getTablePath(), 
loadMetadataDetail.getSegmentFile());
+partitionSpecs.addAll(segmentFileStore.getPartitionSpecs());
+
+  } catch (IOException e) {
+e.printStackTrace();
+  }
+}
+
+List partitionValuesFromExpression =
+PrestoFilterUtil.getPartitionFilters(carbonTable, constraints);
+
+List partitionSpecNamesList = 
partitionSpecs.stream().map(
+PartitionSpec::getPartitions).collect(Collectors.toList());
+
+List partitionSpecsList = new ArrayList(partitionSpecs);
+
+for (int i = 0; i < partitionSpecNamesList.size(); i++) {
+  List partitionSpecNames = partitionSpecNamesList.get(i);
+  if (partitionSpecNames.containsAll(partitionValuesFromExpression)) {
+prunePartitions
+.add(partitionSpecsList.get(i));
+  }
+}
+return prunePartitions;
+  }
+
   private CarbonTableInputFormat  createInputFormat( Configuration 
conf,
-   AbsoluteTableIdentifier identifier, Expression filterExpression)
-  throws IOException {
+  AbsoluteTableIdentifier identifier, Expression filterExpression, 
List filteredPartitions)
+  throws IOException {
 CarbonTableInputFormat format = new CarbonTableInputFormat();
 CarbonTableInputFormat.setTablePath(conf,
-identifier.appendWithLocalPrefix(identifier.getTablePath()));
+identifier.appendWithLocalPrefix(identifier.getTablePath()));
 CarbonTableInputFormat.setFilterPredicates(conf, filterExpression);
-
+if(filteredPartitions.size() != 0) {
+  CarbonTableInputFormat.setPartitionsToPrune(conf, new 
ArrayList<>(filteredPartitions));
--- End diff --

Why Create a new ArrayList here when the calling method is already 
returning a List


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-07 Thread bhavya411
Github user bhavya411 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2139#discussion_r179913191
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
 ---
@@ -387,11 +398,25 @@ public TBase create() {
 config.set(CarbonTableInputFormat.DATABASE_NAME, 
carbonTable.getDatabaseName());
 config.set(CarbonTableInputFormat.TABLE_NAME, 
carbonTable.getTableName());
 
+JobConf jobConf = new JobConf(config);
+List filteredPartitions = new ArrayList();
+
+try {
+  loadMetadataDetails= SegmentStatusManager
+  
.readTableStatusFile(CarbonTablePath.getTableStatusFilePath(carbonTable.getTablePath()));
+} catch (IOException e) {
+  e.printStackTrace();
--- End diff --

The Error should be logged and if there is an IOException than it needs to 
be handled or thrown to make sure that error is propogated.


---


[GitHub] carbondata pull request #2139: [CARBONDATA-2267] [Presto] Support Reading Ca...

2018-04-04 Thread anubhav100
GitHub user anubhav100 opened a pull request:

https://github.com/apache/carbondata/pull/2139

[CARBONDATA-2267] [Presto] Support Reading CarbonData Partition From Presto 
Integration

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anubhav100/incubator-carbondata 
prestopartition

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2139.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2139


commit 9a8cf766e97882a3671cba8d566489f5918cc948
Author: anubhav100 
Date:   2018-04-04T08:18:15Z

added logic for fetching the partition columns in presto

commit 3faf8e0e32e175e89b43edaceef23da0a03927b6
Author: Geetika Gupta 
Date:   2018-04-04T08:37:36Z

Refactored CarbonTableReader to add partition spec to configuration object




---