Re: GC problem and performance refine problem

2016-11-15 Thread Kumar Vishal
Hi An Lan, The data is already distributed; in this case, one blocklet may be returning more rows and another fewer, and because of this some tasks will take more time. The block distribution log is not present in the driver log, so it is not clear whether it is going for block di

Re: [Feature] proposal for update and delete support in Carbon data

2016-11-15 Thread Xiaoqiao He
Hi Vinod, This is a feature many people expect, as Jacky mentioned. I think Update/Delete should be a basic module of CarbonData, although it is a complex problem for a distributed storage system. The solution you proposed is based on the traditional 'Base + Delta' approach, which is applied on bigta

[GitHub] incubator-carbondata pull request #319: [CARBONDATA-411] Test

2016-11-15 Thread Zhangshunyu
Github user Zhangshunyu closed the pull request at: https://github.com/apache/incubator-carbondata/pull/319 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if

[GitHub] incubator-carbondata pull request #320: [CARBONDATA-412][WIP]Fix load bug wh...

2016-11-15 Thread Jay357089
GitHub user Jay357089 opened a pull request: https://github.com/apache/incubator-carbondata/pull/320 [CARBONDATA-412][WIP]Fix load bug when table name has '_' https://issues.apache.org/jira/browse/CARBONDATA-412 ## Reason: this is because in windows, file separator is

[jira] [Created] (CARBONDATA-412) in windows, when load into table whose name has "_", the old segment will be deleted.

2016-11-15 Thread Jay (JIRA)
Jay created CARBONDATA-412: -- Summary: in windows, when load into table whose name has "_", the old segment will be deleted. Key: CARBONDATA-412 URL: https://issues.apache.org/jira/browse/CARBONDATA-412 Proje

Re: GC problem and performance refine problem

2016-11-15 Thread An Lan
Hi Kumar Vishal, 1. I found that the number of rows left after filtering by the inverted index is not uniform across tasks, and the difference is large. Some tasks have 3~4k rows after filtering, but the longer tasks may have 30~40k (3~4w). When most of the longer tasks are on the same node, the time cost will be much longer than other

[GitHub] incubator-carbondata pull request #318: [WIP] Dictionary server implementati...

2016-11-15 Thread ravipesala
Github user ravipesala closed the pull request at: https://github.com/apache/incubator-carbondata/pull/318

[GitHub] incubator-carbondata pull request #319: [CARBONDATA-411] Test

2016-11-15 Thread Zhangshunyu
GitHub user Zhangshunyu opened a pull request: https://github.com/apache/incubator-carbondata/pull/319 [CARBONDATA-411] Test test You can merge this pull request into a Git repository by running: $ git pull https://github.com/Zhangshunyu/incubator-carbondata a Alternatively y

[jira] [Created] (CARBONDATA-411) test

2016-11-15 Thread zhangshunyu (JIRA)
zhangshunyu created CARBONDATA-411: -- Summary: test Key: CARBONDATA-411 URL: https://issues.apache.org/jira/browse/CARBONDATA-411 Project: CarbonData Issue Type: Improvement Compone

[GitHub] incubator-carbondata pull request #311: [CARBONDATA-403] Add example for dat...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/incubator-carbondata/pull/311

[GitHub] incubator-carbondata pull request #267: [CARBONDATA-340] implement test case...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/incubator-carbondata/pull/267

[GitHub] incubator-carbondata pull request #318: [WIP] Dictionary server implementati...

2016-11-15 Thread ravipesala
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/318 [WIP] Dictionary server implementation for single pass data load It is a work in progress; we can review the design of this PR. You can merge this pull request into a Git repository

Re: [Feature] proposal for update and delete support in Carbon data

2016-11-15 Thread Jacky Li
Hi Vinod, It is great to have this feature, as many people asked for data update during the earlier CarbonData meetup. I believe it will be useful for many big data applications. For the solution you proposed, I have the following doubts: 1. Data update is complex as if transaction is