[GitHub] incubator-carbondata issue #751: [CARBONDATA-816] Added Example for Hive Int...

2017-04-10 Thread cenyuhai
Github user cenyuhai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/751
  
I tested this example on my computer and it works well. However, I think you 
should remove some unnecessary dependencies. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #751: [CARBONDATA-816] Added Example for H...

2017-04-10 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/751#discussion_r110684928
  
--- Diff: integration/hive/pom.xml ---
@@ -64,6 +64,79 @@
 compile
 
 
+org.apache.spark
+spark-network-common_2.10
+2.1.0
+
+
+org.apache.hadoop
+hadoop-hdfs
+2.7.3
+
+
+ch.qos.logback
+logback-classic
+
+
+javax.servlet
+servlet-api
+
+
+
+
+org.apache.zookeeper
+zookeeper
+3.4.7
+
+
+jline
+jline
+
+
+
+
+org.apache.carbondata
+carbondata-spark2
+${project.version}
+
+
+org.apache.spark
+spark-sql_${scala.binary.version}
+
+
+org.apache.spark
+spark-hive-thriftserver_${scala.binary.version}
+
+
+org.apache.spark
+spark-repl_${scala.binary.version}
+
+
+org.apache.hadoop
+hadoop-common
+2.7.3
+
+
+org.apache.httpcomponents
+httpclient
+4.3.4
+
+
+org.apache.httpcomponents
+httpcore
+4.3-alpha1
+
+
+org.apache.hadoop
--- End diff --

hadoop-client already pulls in hadoop-hdfs and hadoop-common. Which Hadoop 
version do you want?




[GitHub] incubator-carbondata issue #751: [CARBONDATA-816] Added Example for Hive Int...

2017-04-10 Thread cenyuhai
Github user cenyuhai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/751
  
@anubhav100 could you create a new module under examples and move this 
code into it?
examples
--flink
--spark
--spark2
--hive
----your code




[GitHub] incubator-carbondata issue #740: add hive integration for carbon

2017-04-05 Thread cenyuhai
Github user cenyuhai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/740
  
@QiangCai @chenliang613 




[GitHub] incubator-carbondata pull request #740: add hive integration for carbon

2017-04-05 Thread cenyuhai
GitHub user cenyuhai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/740

add hive integration for carbon

add basic hive integration for carbon


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cenyuhai/incubator-carbondata CARBONDATA-727

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/740.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #740


commit 76fbfa8a09296193d1946646b346eb1f2a358bdb
Author: cenyuhai <cenyu...@didichuxing.com>
Date:   2017-03-12T15:17:40Z

add hive integration for carbon

add hive integration to assembly

alter CarbonInputFormat to implement mapred.InputFormat

add a hive serde for carbon

add hive integration to assembly

fix error in getQueryModel

add debug info

add debug info

add debug info

add debug info

fix error in CarbonRecordReader

use ArrayWritable for CarbonRecordReader

fix error in initializing CarbonRecordReader

fix error in initializing CarbonRecordReader

fix error in initializing CarbonRecordReader

fix error in initializing CarbonRecordReader

change the return value of InputFormat

set the columns to be queried on carbon

fix nullpoint exception

add catalyst depedency

add catalyst depedency

add catalyst depedency

fix error in intializing carbon error

add a new hive carbon recordreader

add code to serialize an object into ArrayWritable

data types such as short/int are actually stored as Long in Carbon

use right inspector

use right inspector

fix long can't cast int error

fix decimal cast error

column size is not equal to column type

column size is not equal to column type

column size is not equal to column type

column size is not equal to column type

fix ObjInspector error

fix ObjInspector error

fix ObjInspector error

add a new hive input split

should not combine path

add support for timestamp

clean codes

remove unused codes

support Date and TimeStamp type

add basic hive integration

alter code style

alter code style

alter code style

alter code style

change create table statement

alter CarbonSerde test case

alter CarbonSerde test case

add carbondata-hive to test classpath

add carbondata-hive to test classpath

use hive compatible schema

exclude kryo

exclude kryo

make a new profile for hive 1.2.1

remove carbon-hive from parent and assembly pom

use groupId to apache hive in pom.xml

remote hadoop-yarn-api, but HadoopFileExample will throw exception when 
debugging in IDEA

change profile name

add quick start guide for basic hive integration module

add private for properties

add some params for hive to read subdirectories recursively
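
The commits above note that short/int values are actually held as Long in Carbon and that a "long can't cast int" error had to be fixed. A minimal sketch of that kind of fix follows. It is not taken from the pull request: the class name NarrowingSketch, the helper narrow, and the type-name strings are hypothetical, and the real SerDe may handle this differently.

```java
public final class NarrowingSketch {

  // Hypothetical helper: converts a value that arrives as a Long into the
  // Java type matching the declared column type. Going through Number avoids
  // the ClassCastException that a direct (Integer) cast on a Long would throw.
  static Object narrow(Object value, String hiveType) {
    if (value == null) {
      return null;
    }
    Number n = (Number) value;
    switch (hiveType) {
      case "smallint":
        return n.shortValue();
      case "int":
        return n.intValue();
      case "bigint":
        return n.longValue();
      default:
        throw new IllegalArgumentException("unsupported type: " + hiveType);
    }
  }

  public static void main(String[] args) {
    Object stored = Long.valueOf(42L);
    try {
      // A boxed Long is not an Integer, so this cast fails at runtime.
      Integer bad = (Integer) stored;
      System.out.println(bad);
    } catch (ClassCastException e) {
      System.out.println("direct cast failed");
    }
    System.out.println(narrow(stored, "int"));
  }
}
```

Note that narrowing through Number silently truncates values outside the target range, so a real implementation might want a range check first.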






[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-04-05 Thread cenyuhai
Github user cenyuhai closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/672




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-25 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r108030714
  
--- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+
+import java.io.IOException;
+import java.util.Properties;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapred.FileOutputFormat;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordWriter;
+import org.apache.hadoop.util.Progressable;
+
+
+public class MapredCarbonOutputFormat<T> extends FileOutputFormat<Void, T>
--- End diff --

MapredCarbonOutputFormat also needs to implement HiveOutputFormat.




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-25 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r108030455
  
--- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+import org.apache.carbondata.core.scan.model.CarbonQueryPlan;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import org.apache.carbondata.hadoop.CarbonInputFormat;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.readsupport.CarbonReadSupport;
+import org.apache.carbondata.hadoop.util.CarbonInputFormatUtil;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.mapred.InputFormat;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordReader;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.hadoop.mapreduce.Job;
+
+
+public class MapredCarbonInputFormat extends CarbonInputFormat
+    implements InputFormat<Void, ArrayWritable>, CombineHiveInputFormat.AvoidSplitCombination {
+
+  @Override
+  public InputSplit[] getSplits(JobConf jobConf, int numSplits) throws IOException {
+    org.apache.hadoop.mapreduce.JobContext jobContext = Job.getInstance(jobConf);
+    List splitList = super.getSplits(jobContext);
--- End diff --

Are invalid segments only useful for CarbonMultiBlockSplit?




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r108026227
  
--- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+
+import java.io.IOException;
+import java.util.Properties;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapred.FileOutputFormat;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordWriter;
+import org.apache.hadoop.util.Progressable;
+
+
+public class MapredCarbonOutputFormat<T> extends FileOutputFormat<Void, T>
--- End diff --

MapredCarbonOutputFormat is only used when creating a table.




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-727][WIP] add hive integ...

2017-03-23 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107607409
  
--- Diff: dev/java-code-format-template.xml ---
@@ -34,8 +34,8 @@
   
 
   
-  
   
+  
--- End diff --

@chenliang613 @QiangCai Qiang Cai told me that it was wrong, so I changed it 
while I was at it.




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-727][WIP] add hive integ...

2017-03-19 Thread cenyuhai
GitHub user cenyuhai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/672

[CARBONDATA-727][WIP] add hive integration for carbon

add support for carbondata

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cenyuhai/incubator-carbondata CARBONDATA-727

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/672.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #672


commit 3a578a1b33720674d7e7adae194af11a6e7fb9de
Author: cenyuhai <cenyu...@didichuxing.com>
Date:   2017-03-12T15:17:40Z

add hive integration for carbon

add hive integration to assembly

alter CarbonInputFormat to implement mapred.InputFormat

add a hive serde for carbon

add hive integration to assembly

fix error in getQueryModel

add debug info

add debug info

add debug info

add debug info

fix error in CarbonRecordReader

use ArrayWritable for CarbonRecordReader

fix error in initializing CarbonRecordReader

fix error in initializing CarbonRecordReader

fix error in initializing CarbonRecordReader

fix error in initializing CarbonRecordReader

change the return value of InputFormat

set the columns to be queried on carbon

fix nullpoint exception

add catalyst depedency

add catalyst depedency

add catalyst depedency

fix error in intializing carbon error

add a new hive carbon recordreader

add code to serialize an object into ArrayWritable

data types such as short/int are actually stored as Long in Carbon

use right inspector

use right inspector

fix long can't cast int error

fix decimal cast error

column size is not equal to column type

column size is not equal to column type

column size is not equal to column type

column size is not equal to column type

fix ObjInspector error

fix ObjInspector error

fix ObjInspector error

add a new hive input split

should not combine path

add support for timestamp

clean codes

remove unused codes

support Date and TimeStamp type






[GitHub] incubator-carbondata issue #391: [CARBONDATA-374] support smallint

2016-12-05 Thread cenyuhai
Github user cenyuhai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/391
  
@ravipesala @chenliang613 




[GitHub] incubator-carbondata issue #391: [CARBONDATA-374] support smallint

2016-12-04 Thread cenyuhai
Github user cenyuhai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/391
  
@jackylk @lion-x please review the code.




[GitHub] incubator-carbondata pull request #391: [CARBONDATA-374] support smallint

2016-12-04 Thread cenyuhai
GitHub user cenyuhai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/391

[CARBONDATA-374] support smallint

**What changes were proposed in this pull request?**

support smallint type

**How to test?**
Test with TestCreateTable.scala

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cenyuhai/incubator-carbondata CARBON-374

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/391.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #391


commit 9e238a441576f93f2b7bc5e85a72c99c0b138130
Author: cenyuhai <261810...@qq.com>
Date:   2016-12-04T10:17:44Z

support smallint






[GitHub] incubator-carbondata pull request #328: [CARBONDATA-374] Support smallint

2016-12-03 Thread cenyuhai
Github user cenyuhai closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/328




[GitHub] incubator-carbondata issue #328: [CARBONDATA-374] Support smallint

2016-11-20 Thread cenyuhai
Github user cenyuhai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/328
  
@Hexiaoqiao There is a doc listing the data types that Carbon supports:
https://cwiki.apache.org/confluence/display/CARBONDATA/Carbon+Data+Types

Short currently does not work because of a bug in Carbon, so it will not be a 
problem.

