[jira] [Created] (CARBONDATA-713) Use store location in properties when user didn't pass the location as the parameter of the constructor

2017-02-20 Thread Yadong Qi (JIRA)
Yadong Qi created CARBONDATA-713:


 Summary: Use store location in properties when user didn't pass 
the location as the parameter of the constructor
 Key: CARBONDATA-713
 URL: https://issues.apache.org/jira/browse/CARBONDATA-713
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 1.1.0-incubating
Reporter: Yadong Qi


The CarbonData store location can come from three places:
1. the default path in code (../carbon.store)
2. the "carbon.storelocation" setting in carbon.properties
3. the location passed as a constructor parameter
Priority runs from low to high.

But when I create a CarbonContext or CarbonSession without any parameters and 
configure "carbon.storelocation" in carbon.properties, the final value of the 
location is still the default (../carbon.store).
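
The expected lookup order can be sketched as follows. This is a minimal 
illustration in Python, not CarbonData's code: the function name and its 
arguments are hypothetical, and only the property key and the default path 
come from the report above.

```python
from typing import Mapping, Optional

# Default path quoted in the issue description.
DEFAULT_STORE_LOCATION = "../carbon.store"


def resolve_store_location(constructor_arg: Optional[str],
                           properties: Mapping[str, str]) -> str:
    """Pick the store location, highest priority first (hypothetical helper)."""
    # 3. An explicit constructor parameter wins.
    if constructor_arg:
        return constructor_arg
    # 2. Otherwise fall back to carbon.properties.
    configured = properties.get("carbon.storelocation")
    if configured:
        return configured
    # 1. Only then use the built-in default.
    return DEFAULT_STORE_LOCATION
```

The reported behaviour corresponds to step 2 being skipped when no 
constructor parameter is given.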





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-714) DOCUMENTATION - How to handle the bad records can be documented

2017-02-20 Thread Gururaj Shetty (JIRA)
Gururaj Shetty created CARBONDATA-714:

 Summary: DOCUMENTATION - How to handle the bad records can be 
documented
 Key: CARBONDATA-714
 URL: https://issues.apache.org/jira/browse/CARBONDATA-714
 Project: CarbonData
  Issue Type: Improvement
Reporter: Gururaj Shetty
Priority: Minor


A Troubleshooting topic can be added on how to handle bad records.
Some of the solutions that can be captured are:
1. Writing to CSV, and which properties the user needs to set
2. Null
3. Fail when there is a bad record
Etc.
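
To make the three options concrete, here is a small generic sketch in 
Python. It is an illustration only: the function, the action names 
("redirect", "null", "fail") and the two-column row layout are assumptions 
made for this example and are not CarbonData load options.

```python
import csv
import io
from typing import List, Optional, TextIO, Tuple

Row = Tuple[str, Optional[int]]
ACTIONS = ("redirect", "null", "fail")  # hypothetical names for options 1-3


def load(lines: List[List[str]], action: str,
         rejects: Optional[TextIO] = None) -> List[Row]:
    """Load (name, age) rows; a row is 'bad' when age is not an integer."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if action == "redirect" and rejects is None:
        raise ValueError("redirect needs a file to write bad records to")
    writer = csv.writer(rejects) if rejects is not None else None
    loaded: List[Row] = []
    for name, raw_age in lines:
        try:
            loaded.append((name, int(raw_age)))
            continue
        except ValueError:
            pass  # bad record: handled according to `action` below
        if action == "redirect":
            writer.writerow([name, raw_age])   # 1. write the bad row to CSV
        elif action == "null":
            loaded.append((name, None))        # 2. keep the row, null the value
        else:
            raise ValueError(f"bad record: {name},{raw_age}")  # 3. fail the load
    return loaded


if __name__ == "__main__":
    rejected = io.StringIO()
    print(load([["a", "1"], ["b", "x"]], "redirect", rejected))
    print(rejected.getvalue())
```

A troubleshooting page would mainly need to say which of these behaviours 
applies by default and which property selects each one.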





[DISCUSS] Graduation to a TLP (Top Level Project)

2017-02-20 Thread Jean-Baptiste Onofré

Hi all,

Given all the work and progress we have made so far in Apache CarbonData, I 
think it's time to start the discussion about graduation as a new TLP 
(Top Level Project) at the Apache Software Foundation.


Graduation means we are a self-sustaining and self-governing community, 
and ready to be a full participant in the Apache Software Foundation. Of 
course, it doesn't imply that our community growth is complete or that a 
particular level of technical maturity has been reached, rather that we 
are on a solid trajectory in those areas. After graduation, we will 
still periodically report to the ASF Board to ensure continued growth of 
a healthy community.


Graduation is an important milestone for the project and for the user 
community.


A way to think about graduation readiness is through the Apache Maturity 
Model [1]. I think we satisfy most of the requirements [2].
There are some TODOs to address. I will tackle them in the coming days 
(release guide, security link, ...).


Regarding the process, graduation consists of drafting a board 
resolution, which needs to identify the full Project Management 
Committee, and getting it approved by the community, the Incubator, and 
the Board. Within the CarbonData community, most of these discussions 
and votes have to be on the private@ mailing list.


I would like to summarize here some points arguing in favor of graduation:
* Project's maturity self-assessment [2]
* 600 pull requests in incubation
* 5 releases (including RC) performed by two different release managers
* 65 contributors
* 4 new committers
* 713 Jira issues created, 593 resolved or closed

Thoughts? If you agree, I would like to share the maturity 
self-assessment on the website.


If you want to help me with some of the TODO tasks, please ping me by e-mail, 
Skype, Hangouts or whatever works, so we can sync.


Thanks!
Regards
JB

[1] http://community.apache.org/apache-way/apache-project-maturity-model.html
[2] https://docs.google.com/document/d/12hifkDCfbyramBba1uRHYjwaKEcxAyWMxS9iwJ1_etY/edit?usp=sharing

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [DISCUSS] Graduation to a TLP (Top Level Project)

2017-02-20 Thread Jean-Baptiste Onofré

By the way, I am going to create the Jira issues and pull requests for the pending TODOs.

Regards
JB


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: data lost when loading data from csv file to carbon table

2017-02-20 Thread Liang Chen
Hi 

I have already raised a JIRA issue on how to handle bad records:
https://issues.apache.org/jira/browse/CARBONDATA-714

Regards
Liang



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/data-lost-when-loading-data-from-csv-file-to-carbon-table-tp7554p7717.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: [DISCUSS] Graduation to a TLP (Top Level Project)

2017-02-20 Thread Liang Chen
Hi JB

Thanks for starting the discussion and driving it.
I will ping you by Skype and email to complete some of the TODO tasks.

One question: in the license analysis section, why are there so many unknown
licenses? Do we need to fix that?

Regards
Liang





[GitHub] incubator-carbondata-site issue #17: Updated the meetup page,links, ddl file...

2017-02-20 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/incubator-carbondata-site/pull/17
  
please rebase it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-20 Thread Liang Chen
Hi all

We are pleased to announce that the PPMC has invited Hexiaoqiao to become a new
Apache CarbonData committer, and the invitation has been accepted!

Congrats to Hexiaoqiao and welcome aboard.

Regards
Liang


Re: [jira] [Created] (CARBONDATA-713) Use store location in properties when user didn't pass the location as the parameter of the constructor

2017-02-20 Thread Liang Chen
Hi
Can you explain in more detail why this is a bug?
The system uses "/carbon.store" as the default store location if users don't
specify the parameter.

Regards
Liang



Re: Exception throws when I load data using carbondata-1.0.0

2017-02-20 Thread Ravindra Pesala
Hi,

How did you create the CarbonContext?
Can you check whether you have provided the same store path in both
carbon.properties and the CarbonContext?
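
One possible explanation for the lock path quoted below, offered as an
assumption rather than a diagnosis: if the store path that reaches the
CarbonContext has no scheme, Hadoop qualifies it against the default file
system, which would turn a local directory into an hdfs:// URI. A minimal
Python sketch of that qualification (the `qualify` helper is hypothetical
and is not a CarbonData or Hadoop API):

```python
from urllib.parse import urlparse


def qualify(path: str, default_fs: str) -> str:
    """Prefix `path` with `default_fs` when it carries no scheme of its own."""
    if urlparse(path).scheme:
        # Already fully qualified, e.g. hdfs://hacluster/tmp/...
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


if __name__ == "__main__":
    # A schemeless local path picks up the cluster's scheme and authority.
    print(qualify("/home/path/hexiaoqiao/carbon.store", "hdfs://hacluster"))
    # A path taken from carbon.storelocation is left untouched.
    print(qualify("hdfs://hacluster/tmp/carbondata/carbon.store", "hdfs://hacluster"))
```

If that is what happens here, passing the configured hdfs:// store path
explicitly when creating the CarbonContext should make both paths agree.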

Regards,
Ravindra.

On 20 February 2017 at 12:26, Xiaoqiao He  wrote:

> Hi Ravindra,
>
> Thanks for your suggestions. But I hit another problem when creating a table
> and loading data.
>
> 1. I followed the README at
> https://github.com/apache/incubator-carbondata/blob/master/build/README.md
> to compile and build CarbonData:
>
> > mvn -DskipTests -Pspark-1.6 -Dspark.version=1.6.2 clean package
>
>
> 2. I think the exceptions mentioned above (ClassNotFoundException and 'exists
> and does not match') are related to the 'spark.executor.extraClassPath'
> configuration item. When I traced the executor logs, I found that the executor
> tries to load the class from the path set in spark.executor.extraClassPath,
> cannot find it locally (this local path is valid only on the driver), and
> throws the exception. When I remove this item from the configuration and run
> the same command with the --jars parameter, the exception is no longer thrown.
>
> 3. But when I create a table following the quick start, as below:
>
> > scala> cc.sql("CREATE TABLE IF NOT EXISTS sample (id string, name string,
> > city string, age Int) STORED BY 'carbondata'")
>
>
> there are some INFO logs such as:
>
> > INFO  20-02 12:00:35,690 - main Query [CREATE TABLE TEST.SAMPLE USING
> > CARBONDATA OPTIONS (TABLENAME "TEST.SAMPLE", TABLEPATH
> > "/HOME/PATH/HEXIAOQIAO/CARBON.STORE/TEST/SAMPLE") ]
>
> and *TABLEPATH does not look like the proper path (I have no idea why this
> path is not an HDFS path)*. I then load data as below, but another exception
> is thrown.
>
> > scala> cc.sql("LOAD DATA INPATH
> > 'hdfs://hacluster/user/hadoop-data/sample.csv' INTO TABLE sample")
>
>
> there are some INFO logs such as:
>
> > INFO  20-02 12:01:27,608 - main HDFS lock
> > path:hdfs://hacluster/home/path/hexiaoqiao/carbon.store/test/sample/meta.lock
>
> *This lock path is not the expected HDFS path; it looks like [hdfs
> scheme://authority] + the local setup path of CarbonData. (Is storelocation
> not taking effect?)*
> An exception is then thrown:
>
> > INFO  20-02 12:01:42,668 - Table MetaData Unlocked Successfully after data load
> > java.lang.RuntimeException: Table is locked for updation. Please try after some time
> > at scala.sys.package$.error(package.scala:27)
> > at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:360)
> > at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
> > at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
> > at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
>
>  ..
>
>
> CarbonData Configuration:
> carbon.storelocation=hdfs://hacluster/tmp/carbondata/carbon.store
> carbon.lock.type=HDFSLOCK
> FYI.
>
> Regards,
> Hexiaoqiao
>
>
> On Sat, Feb 18, 2017 at 3:26 PM, Ravindra Pesala 
> wrote:
>
> > Hi Xiaoqiao,
> >
> > Does the problem still exist?
> > Can you try a clean build with the "mvn clean -DskipTests -Pspark-1.6
> > package" command?
> >
> > Regards,
> > Ravindra.
> >
> > On 16 February 2017 at 08:36, Xiaoqiao He  wrote:
> >
> > > Hi Liang Chen,
> > >
> > > Thanks for your help. It is true that I installed and configured CarbonData
> > > on a "spark on yarn" cluster following the installation guide (
> > > https://github.com/apache/incubator-carbondata/blob/master/docs/installation-guide.md#installing-and-configuring-carbondata-on-spark-on-yarn-cluster
> > > ).
> > >
> > > Best Regards,
> > > Hexiaoqiao
> > >
> > >
> > > On Thu, Feb 16, 2017 at 7:47 AM, Liang Chen 
> > > wrote:
> > >
> > > > Hi He Xiaoqiao,
> > > >
> > > > The quick start is for Spark in local mode.
> > > > Your case is a YARN cluster; please check:
> > > > https://github.com/apache/incubator-carbondata/blob/master/docs/installation-guide.md
> > > >
> > > > Regards
> > > > Liang
> > > >
> > > > 2017-02-15 3:29 GMT-08:00 Xiaoqiao He :
> > > >
> > > > > Hi Manish Gupta,
> > > > >
> > > > > Thanks for your attention. I am trying to load data following
> > > > > https://github.com/apache/incubator-carbondata/blob/master/docs/quick-start-guide.md
> > > > > to deploy carbondata-1.0.0.
> > > > >
> > > > > 1. When I run CarbonData with `bin/spark-shell`, it throws the exception above.
> > > > > 2. When I run CarbonData with `bin/spark-shell --jars
> > > > > carbonlib/carbondata_2.10-1.0.0-incubating-shade-hadoop2.7.1.jar`, it
> > > > > throws another exception, as below:
> > > > >
> > > > > > > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
> > > > > > > in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage
> > > > > > > 0.0 (TID 3, [task hostname]): org.apache.spark.SparkException: File
> > > > > > > ./carbondata_2.10-1.0.0-incubating-shade-hadoop2.7.1.jar exists and does
> > > > > > > not match contents of
> > > > > > > http://

Re: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-20 Thread Xiaoqiao He
Hi PPMC, Liang,

It is my honor to receive the invitation, and I am very happy to have the chance
to take part in building the CarbonData community. I will keep
contributing to Apache CarbonData and continue to promote its practical
application.

Thank you again; I hope CarbonData continues to develop well in the future.

Best Regards.
Hexiaoqiao



[jira] [Created] (CARBONDATA-715) Optimize Single pass data load

2017-02-20 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-715:

 Summary: Optimize Single pass data load
 Key: CARBONDATA-715
 URL: https://issues.apache.org/jira/browse/CARBONDATA-715
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala


1. Upgrade to the latest Netty (4.1.8).
2. Optimize the serialization of keys sent over the network.
3. Launch an individual dictionary client for each loading thread.





Re: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-20 Thread Ravindra Pesala
Congratulations Hexiaoqiao.

Regards,
Ravindra.


-- 
Thanks & Regards,
Ravi


Re: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-20 Thread Sea
Congratulations

Re: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-20 Thread Jean-Baptiste Onofré

Welcome aboard!

Congrats,
Regards
JB


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [DISCUSS] Graduation to a TLP (Top Level Project)

2017-02-20 Thread Jean-Baptiste Onofré

Hi Liang,

For the license analysis, it depends on the dependencies and the content 
of the MANIFEST. We don't have to fix it; we can just add a note when we 
know the actual license of a dependency.


Regards
JB


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com