[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

2018-09-27 Thread zzcclp
Github user zzcclp commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2779#discussion_r221141998
  
--- Diff: 
integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala
 ---
@@ -0,0 +1,55 @@
+/*
--- End diff --

Moved the original class 'CarbonDataSourceScan' to the source path 
'commonTo2.1And2.2', and added a new class 'CarbonDataSourceScan' under the 
source path 'spark2.3', which declares some of its overridden members as lazy.


---


[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

2018-09-27 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2779#discussion_r221131678
  
--- Diff: 
integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala
 ---
@@ -0,0 +1,55 @@
+/*
--- End diff --

Why is it necessary to move CarbonDataSourceScan.scala?


---


[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

2018-09-27 Thread zzcclp
Github user zzcclp commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2779#discussion_r221014706
  
--- Diff: 
integration/spark2/src/main/spark2.3/org/apache/spark/sql/execution/strategy/CarbonDataSourceScan.scala
 ---
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution.strategy
+
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.catalyst.{InternalRow, TableIdentifier}
+import org.apache.spark.sql.catalyst.expressions.{Attribute, SortOrder}
+import org.apache.spark.sql.catalyst.plans.physical.Partitioning
+import org.apache.spark.sql.execution.FileSourceScanExec
+import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}
+
+/**
+ *  Physical plan node for scanning data. It is applied for both tables
+ *  USING carbondata and STORED AS CARBONDATA.
+ */
+class CarbonDataSourceScan(
+    override val output: Seq[Attribute],
+    val rdd: RDD[InternalRow],
+    @transient override val relation: HadoopFsRelation,
+    val partitioning: Partitioning,
+    val md: Map[String, String],
+    identifier: Option[TableIdentifier],
+    @transient private val logicalRelation: LogicalRelation)
+  extends FileSourceScanExec(
+    relation,
+    output,
+    relation.dataSchema,
+    Seq.empty,
+    Seq.empty,
+    identifier) {
+
+  override lazy val supportsBatch: Boolean = true
+
+  override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) =
+    (partitioning, Nil)
+
+  override lazy val metadata: Map[String, String] = md
--- End diff --

The members (supportsBatch, outputPartitioning, outputOrdering, metadata) 
are now declared with the keyword 'lazy' in Spark, so the overrides here 
must be lazy as well. Please see: 
[SPARK-PR#21815](https://github.com/apache/spark/pull/21815)
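
As a minimal sketch of why the overrides need the keyword: in Scala, a 
concrete lazy val can only be overridden by another lazy val. The classes 
`BaseScan` and `CustomScan` below are hypothetical stand-ins and are not 
Spark or CarbonData classes; only the member names are borrowed from the 
diff above.

```scala
// Hypothetical stand-in for a parent class whose members became lazy.
class BaseScan {
  lazy val supportsBatch: Boolean = false
  lazy val metadata: Map[String, String] = Map("Format" -> "base")
}

// Hypothetical stand-in for the subclass that overrides those members.
class CustomScan(md: Map[String, String]) extends BaseScan {
  // Writing `override val supportsBatch = true` would be rejected by the
  // compiler, because a concrete lazy val must be overridden by a lazy val.
  override lazy val supportsBatch: Boolean = true
  override lazy val metadata: Map[String, String] = md
}

object LazyOverrideDemo {
  def main(args: Array[String]): Unit = {
    val scan = new CustomScan(Map("Format" -> "carbondata"))
    println(scan.supportsBatch) // true
    println(scan.metadata("Format")) // carbondata
  }
}
```

The reverse also holds: a strict val in the parent cannot be overridden by 
a lazy val, which is presumably why a separate copy of the class per Spark 
version is needed rather than one shared source file.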


---


[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

2018-09-27 Thread zzcclp
Github user zzcclp commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2779#discussion_r221013145
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/spark/util/CarbonReflectionUtils.scala
 ---
@@ -296,7 +296,7 @@ object CarbonReflectionUtils {
   classOf[LogicalPlan],
   classOf[Seq[Attribute]],
   classOf[SparkPlan])
-  method.invoke(dataSourceObj, mode, query, query.output, physicalPlan)
+  method.invoke(dataSourceObj, mode, query, query.output.map(_.name), physicalPlan)
--- End diff --

The parameters of the 'writeAndRead' method have changed, please see: 
[SPARK-PR#22346](https://github.com/apache/spark/pull/22346)
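
A small sketch of why only the argument changes while the 
`classOf[Seq[Attribute]]` lookup in the context lines stays as it is: 
generic type arguments are erased at runtime, so a reflective lookup with 
`classOf[Seq[...]]` matches a `Seq[String]` parameter too. The `Writer` 
and `Attr` types below are hypothetical and much simpler than the real 
Spark signature; they only illustrate the reflection behaviour.

```scala
object ErasureDemo {
  // Hypothetical stand-in for an attribute that carries a column name.
  final case class Attr(name: String)

  // Hypothetical target whose method now expects column names, not attributes.
  class Writer {
    def writeAndRead(mode: String, outputColumnNames: Seq[String]): String =
      mode + ":" + outputColumnNames.mkString(",")
  }

  def invokeWrite(writer: AnyRef, mode: String, output: Seq[Attr]): String = {
    // classOf[Seq[Attr]] and classOf[Seq[String]] are the same runtime class,
    // so the lookup succeeds even though the element type no longer matches.
    val method = writer.getClass.getMethod(
      "writeAndRead", classOf[String], classOf[Seq[Attr]])
    // The caller is responsible for passing names, mirroring
    // query.output.map(_.name) in the diff above.
    method.invoke(writer, mode, output.map(_.name)).asInstanceOf[String]
  }

  def main(args: Array[String]): Unit = {
    val result = invokeWrite(new Writer, "Overwrite", Seq(Attr("id"), Attr("name")))
    println(result) // Overwrite:id,name
  }
}
```

Because the compiler cannot check reflective calls, a mismatch like this 
would only show up at runtime, which is one reason to keep such calls in a 
single utility object.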


---


[GitHub] carbondata pull request #2779: [WIP] Upgrade spark integration version to 2....

2018-09-27 Thread zzcclp
GitHub user zzcclp opened a pull request:

https://github.com/apache/carbondata/pull/2779

[WIP] Upgrade spark integration version to 2.3.2

Upgrade spark integration version to 2.3.2

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zzcclp/carbondata wip_upgrade_to_spark2.3.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2779


commit 586cf7b6a23fa5b110f3490f8123d1b15b30e4bc
Author: Zhang Zhichao <441586683@...>
Date:   2018-09-27T17:30:34Z

[WIP] Upgrade spark integration version to 2.3.2

Upgrade spark integration version to 2.3.2




---