[jira] [Assigned] (CARBONDATA-489) spark2 decimal issue

2016-12-02 Thread Fei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Wang reassigned CARBONDATA-489:
---

Assignee: Fei Wang

> spark2 decimal issue
> 
>
> Key: CARBONDATA-489
> URL: https://issues.apache.org/jira/browse/CARBONDATA-489
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Fei Wang
>Assignee: Fei Wang
> Fix For: 0.3.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Creating a table with a decimal field and then querying it throws an error 
> (decimal(0, 0) is not supported).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-carbondata pull request #386: [CARBONDATA-489] Fix spark2 decimal ...

2016-12-02 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/386#discussion_r90751746
  
--- Diff: 
integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala
 ---
@@ -30,14 +30,15 @@ class CarbonDataSourceSuite extends FunSuite with 
BeforeAndAfterAll {
   .appName("CarbonExample")
   .enableHiveSupport()
   .config(CarbonCommonConstants.STORE_LOCATION,
-s"examples/spark2/target/store")
+s"/user/hive/warehouse/store")
--- End diff --

Is it OK to hard-code this? What about non-Linux systems?
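
One way to address the portability concern is to build the store path from a base directory with the platform's own path handling instead of a fixed "/user/hive/..." literal. The following is only a minimal sketch: `StoreLocation` and its methods are hypothetical names that are not part of CarbonData, and the choice of the JVM temp directory as the default base is an assumption made for illustration.

```scala
import java.nio.file.{Path, Paths}

object StoreLocation {
  // Hypothetical helper: joins the parts with the platform separator
  // and returns an absolute, normalized path.
  def resolve(baseDir: String, parts: String*): Path =
    Paths.get(baseDir, parts: _*).toAbsolutePath.normalize()

  // Assumed default base: the JVM temp directory, which is defined
  // on Linux, macOS and Windows alike.
  def default(): Path =
    resolve(System.getProperty("java.io.tmpdir"), "carbon", "store")
}
```

In the test above, the result of `StoreLocation.default().toString` could then be passed as the value for `CarbonCommonConstants.STORE_LOCATION`.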


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #386: [CARBONDATA-489] Fix spark2 decimal ...

2016-12-02 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/386#discussion_r90751754
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonMetastore.scala
 ---
@@ -125,6 +125,16 @@ class CarbonMetastore(conf: RuntimeConfig, val 
storePath: String) extends Loggin
 tableCreationTime
   }
 
+  def cleanStore(): Unit = {
+try {
+  val fileType = FileFactory.getFileType(storePath)
+  FileFactory.deleteFile(storePath, fileType)
+} catch {
+  case e => logError("clean store failed", e)
+}
+
--- End diff --

Please remove the empty line.




[GitHub] incubator-carbondata pull request #386: [CARBONDATA-489] Fix spark2 decimal ...

2016-12-02 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/386#discussion_r90751737
  
--- Diff: pom.xml ---
@@ -309,9 +309,6 @@
 
 
   spark-1.5
-  
--- End diff --

Please do not modify this.




[GitHub] incubator-carbondata issue #387: [CARBONDATA-490] [SPARK2] Unify all RDD in ...

2016-12-02 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/387
  
CI:
http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/736/




[GitHub] incubator-carbondata pull request #387: [CARBONDATA-490] [SPARK2] Unify all ...

2016-12-02 Thread jackylk
GitHub user jackylk opened a pull request:

https://github.com/apache/incubator-carbondata/pull/387

[CARBONDATA-490] [SPARK2] Unify all RDD in carbon-spark and carbon-spark2 
modules

Currently there are duplicate RDDs in the carbon-spark and carbon-spark2 
modules. This PR unifies them and moves them to the carbon-spark-common module.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata rdd

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/387.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #387


commit 4c8513e57e5e622d68b29f10b2181ac796659ee4
Author: jackylk 
Date:   2016-12-02T15:54:49Z

modify all RDD

commit ee18fd20bf318141b0295aef02c98fc901aea05b
Author: jackylk 
Date:   2016-12-03T05:48:29Z

rebase






[jira] [Created] (CARBONDATA-490) Unify all RDD in carbon-spark and carbon-spark2 module

2016-12-02 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-490:
---

 Summary: Unify all RDD in carbon-spark and carbon-spark2 module
 Key: CARBONDATA-490
 URL: https://issues.apache.org/jira/browse/CARBONDATA-490
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li


Currently there are duplicate RDDs in the carbon-spark and carbon-spark2 modules.





[jira] [Created] (CARBONDATA-489) spark2 decimal issue

2016-12-02 Thread Fei Wang (JIRA)
Fei Wang created CARBONDATA-489:
---

 Summary: spark2 decimal issue
 Key: CARBONDATA-489
 URL: https://issues.apache.org/jira/browse/CARBONDATA-489
 Project: CarbonData
  Issue Type: Sub-task
  Components: spark-integration
Reporter: Fei Wang


Creating a table with a decimal field and then querying it throws an error 
(decimal(0, 0) is not supported).
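
The issue does not say where the decimal(0, 0) type comes from, so the following is only a sketch of the kind of check that rejects it. It assumes the Hive-style bounds for decimal types (precision from 1 to 38, scale from 0 to the precision); `DecimalSpec` and its members are hypothetical names, not CarbonData or Spark APIs.

```scala
object DecimalSpec {
  // Assumed bounds, following the Hive convention.
  val MaxPrecision = 38

  // Matches type strings such as "decimal(10,2)" (case-insensitive).
  private val Pattern = """(?i)decimal\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*\)""".r

  // True when 1 <= precision <= 38 and 0 <= scale <= precision.
  def isValid(precision: Int, scale: Int): Boolean =
    precision >= 1 && precision <= MaxPrecision && scale >= 0 && scale <= precision

  // Extracts (precision, scale) from a type string, if it is a decimal type.
  def parse(typeName: String): Option[(Int, Int)] = typeName.trim match {
    case Pattern(p, s) => Some((p.toInt, s.toInt))
    case _             => None
  }
}
```

Under these assumed bounds, `DecimalSpec.isValid(0, 0)` is false while `DecimalSpec.isValid(10, 2)` is true, which is consistent with a decimal column that loses its precision and scale being reported as unsupported.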





[GitHub] incubator-carbondata pull request #386: Fix spark2 decimal issue

2016-12-02 Thread scwf
GitHub user scwf opened a pull request:

https://github.com/apache/incubator-carbondata/pull/386

Fix spark2 decimal issue

Also added a test suite for decimal.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KirinKing/incubator-carbondata fix-decimal

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/386.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #386


commit 4da98d3c54021d8744809df0e779fac2dbc34a5a
Author: wangfei 
Date:   2016-12-03T04:44:07Z

fix spark2 decimal

commit f351c8d6e586019d13a087dddaddb759127ab948
Author: wangfei 
Date:   2016-12-03T04:52:30Z

code clean






[GitHub] incubator-carbondata pull request #385: Fix spark2 decimal

2016-12-02 Thread scwf
Github user scwf closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/385




[GitHub] incubator-carbondata pull request #385: Fix spark2 decimal

2016-12-02 Thread scwf
GitHub user scwf opened a pull request:

https://github.com/apache/incubator-carbondata/pull/385

Fix spark2 decimal

Also added a test suite for decimal.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KirinKing/incubator-carbondata SJS

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/385.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #385


commit cf11617c2e8c4480873356ce2e2333a4bc8a180f
Author: wangfei 
Date:   2016-12-03T00:34:53Z

add sjs testsuite

commit c1587b87d8de2bc9a95da1bd35f2dc59aa6376a6
Author: wangfei 
Date:   2016-12-03T01:26:01Z

decimal => double and date => string

commit 76c8f66486dabf51ae5cf184846d991e6affce6d
Author: wangfei 
Date:   2016-12-03T01:59:13Z

use original datatype to create table

commit 7b27e416fa8575a7f0abb0e01689f9bf1b7a7da5
Author: wangfei 
Date:   2016-12-03T01:59:50Z

date => string

commit 4b40a64051b2dfa13e17f53e88915e92d8b04182
Author: wangfei 
Date:   2016-12-03T02:25:42Z

added clean store

commit 34537e8ff9681c84102e006c1132709b0912ea60
Author: wangfei 
Date:   2016-12-03T02:26:28Z

fix 4.2.7 cartession join

commit e3280cf6647ee958f35b8ce623d17cac5195bcd2
Author: wangfei 
Date:   2016-12-03T02:29:45Z

fix 4.2.4






[jira] [Resolved] (CARBONDATA-488) add InsertInto feature for spark2

2016-12-02 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-488.
-
Resolution: Fixed

> add InsertInto feature for spark2
> -
>
> Key: CARBONDATA-488
> URL: https://issues.apache.org/jira/browse/CARBONDATA-488
> Project: CarbonData
>  Issue Type: New Feature
>  Components: data-load
>Affects Versions: 0.3.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 0.3.0-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[GitHub] incubator-carbondata pull request #384: [CARBONDATA-488][SPARK2]add InsertIn...

2016-12-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/384




[GitHub] incubator-carbondata issue #384: [CARBONDATA-488][SPARK2]add InsertInto feat...

2016-12-02 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/384
  
CI
http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/735/




[GitHub] incubator-carbondata issue #384: [CARBONDATA-488][SPARK2]add InsertInto feat...

2016-12-02 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/384
  
LGTM




[jira] [Resolved] (CARBONDATA-487) spark2 integration is not compiling

2016-12-02 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-487.
-
Resolution: Fixed
  Assignee: Jacky Li

> spark2 integration is not compiling
> ---
>
> Key: CARBONDATA-487
> URL: https://issues.apache.org/jira/browse/CARBONDATA-487
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 0.3.0-incubating
>
>
> spark2 integration is not compiling





[GitHub] incubator-carbondata pull request #383: [CARBONDARA-487] fix spark2 compilat...

2016-12-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/383




[GitHub] incubator-carbondata pull request #362: [CARBONDATA-459] Block distribution ...

2016-12-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/362




[GitHub] incubator-carbondata pull request #384: [CARBONDATA-488][SPARK2]add InsertIn...

2016-12-02 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/384#discussion_r90694568
  
--- Diff: 
examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonExample.scala
 ---
@@ -17,79 +17,97 @@
 
 package org.apache.spark.sql.examples
 
+import java.io.File
+
+import org.apache.commons.io.FileUtils
 import org.apache.spark.sql.SparkSession
-import org.apache.spark.util.TableLoader
 
 import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
 
 object CarbonExample {
 
   def main(args: Array[String]): Unit = {
-// to run the example, plz change this path to your local machine path
-val rootPath = "/home/david/Documents/incubator-carbondata"
+val rootPath = new File(this.getClass.getResource("/").getPath
++ "../../../..").getCanonicalPath
+val storeLocation = s"$rootPath/examples/spark2/target/store"
+val warehouse = s"$rootPath/examples/spark2/target/warehouse"
+val metastoredb = s"$rootPath/examples/spark2/target/metastore_db"
+
+// clean data folder
+if (true) {
--- End diff --

Why is `true` hard-coded here?
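
One possible alternative to `if (true)` is to make the cleanup depend on an explicit flag. The sketch below assumes a hypothetical `ExampleCleanup` helper with a `clean` parameter; neither the name nor the flag appears in the pull request, and it uses only `java.io.File` rather than the `FileUtils` import shown in the diff.

```scala
import java.io.File

object ExampleCleanup {
  // Recursively deletes a file or directory.
  // Returns true if the path no longer exists afterwards.
  def deleteRecursively(f: File): Boolean = {
    val children = Option(f.listFiles()).getOrElse(Array.empty[File])
    children.foreach(c => deleteRecursively(c))
    !f.exists() || f.delete()
  }

  // Hypothetical flag: the data folders are cleaned only when requested.
  def prepare(dirs: Seq[File], clean: Boolean): Unit =
    if (clean) dirs.foreach(d => deleteRecursively(d))
}
```

The example's `main` could then call `ExampleCleanup.prepare(Seq(new File(storeLocation), new File(warehouse), new File(metastoredb)), clean = ...)` with the flag taken from a program argument or a property.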




[GitHub] incubator-carbondata pull request #384: [CARBONDATA-488][SPARK2]add InsertIn...

2016-12-02 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/384

[CARBONDATA-488][SPARK2]add InsertInto feature for spark2

1. Add the InsertInto feature for spark2

2. Optimize CarbonExample to use a relative path and to load data with 
InsertInto

Link:
[CARBONDATA-488](https://issues.apache.org/jira/browse/CARBONDATA-488)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
insertinto_for_spark2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/384.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #384


commit a1b3b2962a8e12f09cf5efabc15de071e105c885
Author: QiangCai 
Date:   2016-12-02T17:53:32Z

insertinto for spark2






[jira] [Created] (CARBONDATA-488) add InsertInto feature for spark2

2016-12-02 Thread QiangCai (JIRA)
QiangCai created CARBONDATA-488:
---

 Summary: add InsertInto feature for spark2
 Key: CARBONDATA-488
 URL: https://issues.apache.org/jira/browse/CARBONDATA-488
 Project: CarbonData
  Issue Type: New Feature
  Components: data-load
Affects Versions: 0.3.0-incubating
Reporter: QiangCai
Assignee: QiangCai
 Fix For: 0.3.0-incubating








[jira] [Resolved] (CARBONDATA-486) Reading dataframe concurrently will lead to wrong data

2016-12-02 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-486.
-
Resolution: Fixed

> Reading dataframe concurrently will lead to wrong data
> --
>
> Key: CARBONDATA-486
> URL: https://issues.apache.org/jira/browse/CARBONDATA-486
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 0.3.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 0.3.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[GitHub] incubator-carbondata pull request #382: [CARBONDATA-486]fix bug for reading ...

2016-12-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/382




[GitHub] incubator-carbondata issue #382: [CARBONDATA-486]fix bug for reading datafra...

2016-12-02 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/382
  
CI passed
http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/734/




[GitHub] incubator-carbondata issue #382: [CARBONDATA-486]fix bug for reading datafra...

2016-12-02 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/382
  
LGTM




[GitHub] incubator-carbondata pull request #382: [CARBONDATA-486]fix bug for reading ...

2016-12-02 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/382

[CARBONDATA-486]fix bug for reading dataframe concurrently

Fix an InsertInto bug that occurs when reading from a Hive table concurrently.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixbugforinsertinto2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/382.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #382


commit 9dcdf7de6bde64d1c800fd268f2099d2278e8f33
Author: QiangCai 
Date:   2016-12-02T09:41:23Z

fix bug for reading dataframe concurrently






[jira] [Updated] (CARBONDATA-486) Reading dataframe concurrently will lead to wrong data

2016-12-02 Thread QiangCai (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

QiangCai updated CARBONDATA-486:

Summary: Reading dataframe concurrently will lead to wrong data  (was: 
Rreading dataframe concurrently will lead to wrong data)

> Reading dataframe concurrently will lead to wrong data
> --
>
> Key: CARBONDATA-486
> URL: https://issues.apache.org/jira/browse/CARBONDATA-486
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 0.3.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 0.3.0-incubating
>
>






[jira] [Created] (CARBONDATA-486) Rreading dataframe concurrently will lead to wrong data

2016-12-02 Thread QiangCai (JIRA)
QiangCai created CARBONDATA-486:
---

 Summary: Rreading dataframe concurrently will lead to wrong data
 Key: CARBONDATA-486
 URL: https://issues.apache.org/jira/browse/CARBONDATA-486
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 0.3.0-incubating
Reporter: QiangCai
Assignee: QiangCai
 Fix For: 0.3.0-incubating








[GitHub] incubator-carbondata pull request #381: [CARBONDATA-485] Refactored code for...

2016-12-02 Thread kunal642
GitHub user kunal642 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/381

[CARBONDATA-485] Refactored code for DataGraphExecuter

Removed unused parameters
Added constants


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kunal642/incubator-carbondata code_refactor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/381.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #381


commit 4ae9e9f3220e9dc65d2c53698992e7d419d03213
Author: kunal642 
Date:   2016-12-02T11:22:06Z

Refactored code






[jira] [Created] (CARBONDATA-485) Refactor code for DataGraphExecuter

2016-12-02 Thread Prabhat Kashyap (JIRA)
Prabhat Kashyap created CARBONDATA-485:
--

 Summary: Refactor code for DataGraphExecuter
 Key: CARBONDATA-485
 URL: https://issues.apache.org/jira/browse/CARBONDATA-485
 Project: CarbonData
  Issue Type: Improvement
Reporter: Prabhat Kashyap
Priority: Trivial








[jira] [Commented] (CARBONDATA-417) [Bad Records] Not created and not writen log file when logger is True and action as Fail

2016-12-02 Thread MAKAMRAGHUVARDHAN (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15714728#comment-15714728
 ] 

MAKAMRAGHUVARDHAN commented on CARBONDATA-417:
--

As per the current behavior, the issue is invalid, so this issue has been closed.

> [Bad Records] Not created and not writen log file when logger is True and 
> action as Fail
> 
>
> Key: CARBONDATA-417
> URL: https://issues.apache.org/jira/browse/CARBONDATA-417
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 0.1.1-incubating
> Environment: 3 node Cluster
>Reporter: MAKAMRAGHUVARDHAN
>Assignee: Akash R Nilugal
>Priority: Minor
>
> Steps:
> 1. Create Table:
> CREATE TABLE truefail (ID int,CUST_ID int,cust_name string) STORED BY 
> 'org.apache.carbondata.format';
> 2. Load data with the logger set to TRUE and the action set to FAIL
> LOAD DATA INPATH 'hdfs://hacluster/Raghu/test2.csv' into table truefail 
> OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FAIL','FILEHEADER'='ID,CUST_ID,cust_name');
> 0: jdbc:hive2://ha-cluster/default>  CREATE TABLE truefail (ID int,CUST_ID 
> int,cust_name string) STORED BY 'org.apache.carbondata.format';
> +---------+--+
> | result  |
> +---------+--+
> +---------+--+
> No rows selected (0.679 seconds)
> 0: jdbc:hive2://ha-cluster/default>  LOAD DATA INPATH 
> 'hdfs://hacluster/Raghu/test2.csv' into table truefail 
> OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FAIL','FILEHEADER'='ID,CUST_ID,cust_name');
> Error: java.lang.Exception: DataLoad failure: Data load failed due to bad 
> record ,The value  
> "987654321010111213141516171819101122334455667788990012131415161718191909192939495969798"
>  with column name CUST_ID and column data type INT is not a valid Record 
> (state=,code=0)
> 0: jdbc:hive2://ha-cluster/default>
> Actual Result: No log file is created or written for bad records when 
> 'BAD_RECORDS_LOGGER_ENABLE'='TRUE', 'BAD_RECORDS_ACTION'='FAIL'
> Expected Result: A log file should be created and written for bad records 
> when 'BAD_RECORDS_LOGGER_ENABLE'='TRUE', 'BAD_RECORDS_ACTION'='FAIL'
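
For illustration only, the behavior the reporter expected (write the bad records to a log file first, then fail the load) can be sketched as below. Per the comment above, this is not how CarbonData currently behaves; `BadRecordSketch`, `Options` and `loadIntColumn` are hypothetical names, and the log line format is invented for the example.

```scala
import java.io.{File, FileWriter}
import scala.util.Try

object BadRecordSketch {
  // Illustrative stand-ins for BAD_RECORDS_LOGGER_ENABLE and BAD_RECORDS_ACTION.
  final case class Options(loggerEnabled: Boolean, action: String)

  // Parses values for an INT column. Bad values are appended to logFile when
  // the logger is enabled; the load then fails if the action is FAIL.
  def loadIntColumn(values: Seq[String], opts: Options, logFile: File): Seq[Int] = {
    val (good, bad) = values.partition(v => Try(v.trim.toInt).isSuccess)
    if (bad.nonEmpty && opts.loggerEnabled) {
      val writer = new FileWriter(logFile, true)
      try bad.foreach(v => writer.write(s"$v is not a valid INT\n"))
      finally writer.close()
    }
    if (bad.nonEmpty && opts.action.equalsIgnoreCase("FAIL")) {
      throw new IllegalArgumentException(
        s"Data load failed due to bad record: ${bad.head}")
    }
    good.map(_.trim.toInt)
  }
}
```

With `Options(loggerEnabled = true, action = "FAIL")`, a value that overflows INT, such as the CUST_ID value in the report, is written to the log file before the exception is thrown.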





[jira] [Updated] (CARBONDATA-484) Implement LRU cache for B-Tree to ensure to avoid out memory, when too many number of tables exits and all are not frequently used.

2016-12-02 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-484:

Summary: Implement LRU cache for B-Tree to ensure to avoid out memory, when 
too many number of tables exits and all are not frequently used.  (was: LRU 
cache for B-Tree to ensure to avoid out memory, when too many number of tables 
exits and all are not frequently used.)

> Implement LRU cache for B-Tree to ensure to avoid out memory, when too many 
> number of tables exits and all are not frequently used.
> ---
>
> Key: CARBONDATA-484
> URL: https://issues.apache.org/jira/browse/CARBONDATA-484
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Attachments: B-Tree LRU Cache.pdf
>
>
> *LRU Cache for B-Tree*
> Problem:
> CarbonData maintains two levels of B-Tree cache, one at the driver level 
> and another at the executor level. Currently CarbonData has a mechanism to 
> invalidate the segment and block caches for invalid table segments, but 
> there is no eviction policy for unused cached objects. So once the memory is 
> completely used, the system will not be able to process any new requests.
> Solution:
> The caches maintained at the driver level and at the executor level will 
> contain objects that are not currently in use. Therefore the system should 
> provide the following mechanism:
> 1.   Set a maximum memory limit up to which objects can be held in memory.
> 2.   When the configured memory limit is reached, identify the cached 
> objects that are not currently in use, so that the required memory can be 
> freed without impacting running processes.
> 3.   Evict only until the required memory has been freed.
> For details, please refer to the attachments.
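
The three points above can be sketched as a small size-bounded LRU cache. This is a minimal illustration and not the design from the attached document: the class name `LruCache`, the per-entry `size` in bytes, and the `inUse` flag are all assumptions made for the example.

```scala
import scala.collection.mutable

// Hypothetical cache: `maxBytes` is the configured memory limit (point 1).
final class LruCache[K, V](maxBytes: Long) {
  private final class Entry(val value: V, val size: Long, var inUse: Boolean)

  // LinkedHashMap keeps insertion order; re-inserting a key on access
  // moves it to the end, so the head is the least recently used entry.
  private val entries = mutable.LinkedHashMap.empty[K, Entry]
  private var usedBytes = 0L

  def get(key: K): Option[V] = entries.remove(key).map { e =>
    entries.put(key, e)
    e.value
  }

  def setInUse(key: K, inUse: Boolean): Unit =
    entries.get(key).foreach(e => e.inUse = inUse)

  // Returns false when the limit would be exceeded and not enough idle
  // entries can be evicted. An existing entry for `key` is dropped first.
  def put(key: K, value: V, size: Long): Boolean = {
    entries.remove(key).foreach(e => usedBytes -= e.size)
    val needed = usedBytes + size - maxBytes
    if (needed > 0 && !evict(needed)) false
    else {
      entries.put(key, new Entry(value, size, false))
      usedBytes += size
      true
    }
  }

  // Evicts idle entries in LRU order (point 2), and stops as soon as
  // `needed` bytes have been freed (point 3).
  private def evict(needed: Long): Boolean = {
    val idle = entries.iterator.filter { case (_, e) => !e.inUse }.toList
    if (idle.map(_._2.size).sum < needed) false
    else {
      var freed = 0L
      val it = idle.iterator
      while (freed < needed && it.hasNext) {
        val (k, e) = it.next()
        entries.remove(k)
        usedBytes -= e.size
        freed += e.size
      }
      true
    }
  }

  def contains(key: K): Boolean = entries.contains(key)
  def used: Long = usedBytes
}
```

Entries marked as in use are never evicted in this sketch, so a `put` can fail when every cached object is held by a running query; how the real implementation handles that case is not stated in the issue.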





[jira] [Updated] (CARBONDATA-484) LRU cache for B-Tree to ensure to avoid out memory, when too many number of tables exits and all are not frequently used.

2016-12-02 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-484:

Attachment: B-Tree LRU Cache.pdf

LRU cache for B-Tree Design Doc 

> LRU cache for B-Tree to ensure to avoid out memory, when too many number of 
> tables exits and all are not frequently used.
> -
>
> Key: CARBONDATA-484
> URL: https://issues.apache.org/jira/browse/CARBONDATA-484
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Attachments: B-Tree LRU Cache.pdf
>
>
> *LRU Cache for B-Tree*
> Problem:
> CarbonData maintains two levels of B-Tree cache, one at the driver level 
> and another at the executor level. Currently CarbonData has a mechanism to 
> invalidate the segment and block caches for invalid table segments, but 
> there is no eviction policy for unused cached objects. So once the memory is 
> completely used, the system will not be able to process any new requests.
> Solution:
> The caches maintained at the driver level and at the executor level will 
> contain objects that are not currently in use. Therefore the system should 
> provide the following mechanism:
> 1.   Set a maximum memory limit up to which objects can be held in memory.
> 2.   When the configured memory limit is reached, identify the cached 
> objects that are not currently in use, so that the required memory can be 
> freed without impacting running processes.
> 3.   Evict only until the required memory has been freed.
> For details, please refer to the attachments.





[jira] [Updated] (CARBONDATA-484) LRU cache for B-Tree to ensure to avoid out memory, when too many number of tables exits and all are not frequently used.

2016-12-02 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-484:

Issue Type: New Feature  (was: Bug)

> LRU cache for B-Tree to ensure to avoid out memory, when too many number of 
> tables exits and all are not frequently used.
> -
>
> Key: CARBONDATA-484
> URL: https://issues.apache.org/jira/browse/CARBONDATA-484
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>
> *LRU Cache for B-Tree*
> Problem:
> CarbonData maintains two levels of B-Tree cache, one at the driver level 
> and another at the executor level. Currently CarbonData has a mechanism to 
> invalidate the segment and block caches for invalid table segments, but 
> there is no eviction policy for unused cached objects. So once the memory is 
> completely used, the system will not be able to process any new requests.
> Solution:
> The caches maintained at the driver level and at the executor level will 
> contain objects that are not currently in use. Therefore the system should 
> provide the following mechanism:
> 1.   Set a maximum memory limit up to which objects can be held in memory.
> 2.   When the configured memory limit is reached, identify the cached 
> objects that are not currently in use, so that the required memory can be 
> freed without impacting running processes.
> 3.   Evict only until the required memory has been freed.
> For details, please refer to the attachments.





[jira] [Created] (CARBONDATA-484) LRU cache for B-Tree to ensure to avoid out memory, when too many number of tables exits and all are not frequently used.

2016-12-02 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-484:
---

 Summary: LRU cache for B-Tree to ensure to avoid out memory, when 
too many number of tables exits and all are not frequently used.
 Key: CARBONDATA-484
 URL: https://issues.apache.org/jira/browse/CARBONDATA-484
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan


*LRU Cache for B-Tree*
Problem:

CarbonData maintains two levels of B-Tree cache, one at the driver level 
and another at the executor level. Currently CarbonData has a mechanism to 
invalidate the segment and block caches for invalid table segments, but 
there is no eviction policy for unused cached objects. So once the memory is 
completely used, the system will not be able to process any new requests.

Solution:

The caches maintained at the driver level and at the executor level will 
contain objects that are not currently in use. Therefore the system should 
provide the following mechanism:

1.   Set a maximum memory limit up to which objects can be held in memory.

2.   When the configured memory limit is reached, identify the cached objects 
that are not currently in use, so that the required memory can be freed 
without impacting running processes.

3.   Evict only until the required memory has been freed.

For details, please refer to the attachments.





[GitHub] incubator-carbondata issue #333: [CARBONDATA-471]Optimized no kettle flow an...

2016-12-02 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/333
  
LGTM





[GitHub] incubator-carbondata pull request #333: [CARBONDATA-471]Optimized no kettle ...

2016-12-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/333




[jira] [Created] (CARBONDATA-483) Add Unit Tests For core.carbon.metadata package

2016-12-02 Thread SWATI RAO (JIRA)
SWATI RAO created CARBONDATA-483:


 Summary: Add Unit Tests For core.carbon.metadata package
 Key: CARBONDATA-483
 URL: https://issues.apache.org/jira/browse/CARBONDATA-483
 Project: CarbonData
  Issue Type: Test
Reporter: SWATI RAO
Priority: Trivial





