[GitHub] incubator-carbondata pull request #316: [WIP]create agg table segment for ev...

2016-11-14 Thread Jay357089
GitHub user Jay357089 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/316

[WIP]create agg table segment for every fact table single segment

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[CARBONDATA-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).
 - [ ] Testing done
 
Please provide details on
 - Whether new unit test cases have been added or why no new tests are required?
 - What manual testing you have done?
 - Any additional information to help reviewers in testing this change.
 
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
 
---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Jay357089/incubator-carbondata createAggTable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/316.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #316


commit 92da6a1313ef5eaa532e10d46828d0e6791a4119
Author: Jay357089 
Date:   2016-11-10T07:01:37Z

create agg table segment for every fact table single segment




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (CARBONDATA-409) Drop of a non-existing macro executes successfully whereas it should give an error.

2016-11-14 Thread Sangeeta Gulia (JIRA)
Sangeeta Gulia created CARBONDATA-409:

 Summary: Drop of a non-existing macro executes successfully whereas it 
should give an error.
 Key: CARBONDATA-409
 URL: https://issues.apache.org/jira/browse/CARBONDATA-409
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Reporter: Sangeeta Gulia


I have created a macro:

hive> CREATE TEMPORARY MACRO simple_add (x int, y int) x + y;

Then I dropped the macro (a subsequent select confirms it no longer exists):

hive> drop temporary macro simple_add;
OK
Time taken: 0.038 seconds

hive> select simple_add(2,3);
FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'simple_add'

Then I tried to drop the same macro again, and it again executed without any 
exception:

hive> drop temporary macro simple_add;
OK
Time taken: 0.016 seconds



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [Vote] Please provide valuable feedback and vote for LIKE filter query performance optimization

2016-11-14 Thread sujith chacko
Hi Liang,
Yes, it's for high cardinality columns.
Thanks,
Sujith

On Nov 14, 2016 2:01 PM, "Liang Chen"  wrote:

> Hi
>
> I have one query: for no-dictionary columns with high cardinality, such as
> phone numbers, is the pruning cost high or not?
>
> Regards
> Liang
>
> 2016-11-14 15:18 GMT+08:00 sujith chacko :
>
> > Hi All,
> >
> >   I am going to optimize the LIKE filter query flow for no-dictionary
> > columns; please find the details below.
> >
> > *Current design:*
> > For LIKE filter queries, no push-down to the Carbon layer happens, so no
> > block/blocklet-level pruning can be done before the LIKE filters are
> > applied. This adds overhead while scanning, since the system has to scan
> > all blocks and blocklets in order to apply the filters.
> >
> > *Proposed design/solution:*
> > LIKE filters (startsWith, endsWith, contains) can be pushed down to the
> > Carbon engine layer so that Carbon can perform block- and blocklet-level
> > pruning before applying the filters. Block-level pruning happens on the
> > driver side and blocklet-level pruning is done in the executor, as per the
> > existing design.
> >
> > Requesting all to please provide valuable feedback and vote for
> > implementing the above solution in order to improve LIKE filter queries.
> >
> > Thanks,
> > Sujith
> >
>
>
>
> --
> Regards
> Liang
>
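
To illustrate the block-pruning idea in the proposal quoted above, here is a
minimal sketch (not CarbonData's actual code; every name below is
hypothetical): a startsWith-style LIKE filter on a no-dictionary column can be
rewritten as a range check against a block's min/max statistics, so blocks
that cannot contain the prefix are skipped before the filter itself is
evaluated.

  // Illustrative sketch only -- not CarbonData code; names are hypothetical.
  object LikePruningSketch {

    // Smallest string strictly greater than every string starting with
    // `prefix`, obtained by incrementing its last non-maximal character.
    def prefixUpperBound(prefix: String): Option[String] = {
      val chars = prefix.toCharArray
      var i = chars.length - 1
      while (i >= 0) {
        if (chars(i) != Char.MaxValue) {
          chars(i) = (chars(i) + 1).toChar
          return Some(new String(chars, 0, i + 1))
        }
        i -= 1
      }
      None // prefix consists only of the maximal character; no finite bound
    }

    // True when the block's [blockMin, blockMax] value range cannot contain
    // any value starting with `prefix`, i.e. the whole block can be pruned.
    def canSkipBlock(prefix: String, blockMin: String, blockMax: String): Boolean =
      blockMax.compareTo(prefix) < 0 ||
        prefixUpperBound(prefix).exists(upper => blockMin.compareTo(upper) >= 0)
  }

  // Example: a block whose values lie between "aaa" and "abz" cannot match
  // LIKE 'ph%', so it is skipped without being scanned:
  //   LikePruningSketch.canSkipBlock("ph", "aaa", "abz")  // true
  //   LikePruningSketch.canSkipBlock("ph", "pa", "pz")    // false

Note that only startsWith maps cleanly onto a single min/max range; endsWith
and contains would need different block statistics, so the benefit there
presumably depends on the engine-level design referred to above.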


Re: Single Pass Data Load Design

2016-11-14 Thread Liang Chen
Hi

Yes, this is a good feature. It should significantly improve data load
performance.
Can you provide a sequence diagram for the whole data load process?

Regards
Liang

2016-11-14 15:42 GMT+08:00 Jacky Li :

> Hi Ravindra,
>
> Thanks for proposing this design. It is really exciting if CarbonData can
> do a 1-pass solution for loading. I have given some comments in the design
> document.
>
> Regards,
> Jacky
>
>
>
> --
> View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Single-Pass-Data-Load-Design-tp2875p2894.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>



-- 
Regards
Liang


[GitHub] incubator-carbondata pull request #313: [CARBONDATA-405]Fixed Data load fail...

2016-11-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/313




Re: [Vote] Please provide valuable feedback and vote for LIKE filter query performance optimization

2016-11-14 Thread Ravindra Pesala
+1

On Mon, Nov 14, 2016, 3:54 PM sujith chacko 
wrote:

> Hi Liang,
> Yes, it's for high cardinality columns.
> Thanks,
> Sujith
>


[Feature] Proposal for update and delete support in CarbonData

2016-11-14 Thread Vinod KC
Hi All,

I would like to propose the following new features in CarbonData:
1) An Update statement to support modifying existing records in a CarbonData
table.
2) A Delete statement to remove records from a CarbonData table.

A) Update operation: 'Update' support can be added to CarbonData using
intermediate delta files (delete/update delta files) with minimal impact on
the existing code. An update can be treated as a 'delete' followed by an
'insert'. Once an update is done on a CarbonData file, the CarbonData store
reader can, during a select query, use the delete-delta cache to exclude the
deleted records in that segment and then include the records from the newly
added update delta files.

B) Delete operation: For a delete operation, a delete delta file will be
added to each segment that contains matching records. During a select query,
the CarbonData reader will exclude those deleted records from the result set.
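
As a purely illustrative sketch of the delete path described in B) above (the
types and layout below are assumptions made for explanation, not the actual
CarbonData reader design), a scan could consult a per-segment delete delta to
drop deleted rows before building the result set:

  // Hypothetical types for illustration only.
  // A delete delta records which rows of a segment were deleted; the scanner
  // drops those rows while reading.
  final case class RowRef(segmentId: String, blockletId: Int, rowId: Int)

  final class DeleteDelta(deleted: Set[(Int, Int)]) { // (blockletId, rowId)
    def isDeleted(blockletId: Int, rowId: Int): Boolean =
      deleted.contains((blockletId, rowId))
  }

  final class SegmentScanner(deltaBySegment: Map[String, DeleteDelta]) {
    // Filter a stream of rows, excluding any row flagged in its segment's
    // delete delta.
    def excludeDeleted[T](rows: Iterator[(RowRef, T)]): Iterator[(RowRef, T)] =
      rows.filterNot { case (ref, _) =>
        deltaBySegment.get(ref.segmentId)
          .exists(_.isDeleted(ref.blockletId, ref.rowId))
      }
  }

An update would then reuse the same mechanism: the old version of the row is
excluded via the delete delta, and the new version is read from the update
delta file added to the segment.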

Please share your suggestions and thoughts on the design and functional
aspects of this feature. I'll share a detailed design document on the above
later.

Regards
Vinod


Re: [RESULT][VOTE] Apache CarbonData 0.2.0-incubating release

2016-11-14 Thread Uma gangumalla
Sorry for coming in late on this.

Here is my +1 (binding) too.

I will carry my +1 to the incubator list as well.

Regards,
Uma

On Sun, Nov 13, 2016 at 6:20 AM, Liang Chen  wrote:

> Hi
>
> The PPMC vote has passed; the result is as below:
> +1(binding) : 6
> +1(non-binding) : 6
> Thanks all for your vote.
>
> Regards
> Liang
>
> Liang Chen wrote
> > Hi all,
> >
> > I submit the CarbonData 0.2.0-incubating release to your vote.
> >
> > Release Notes:
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220&version=12337896
> >
> > Staging Repository:
> > https://repository.apache.org/content/repositories/orgapachecarbondata-1006
> >
> > Git Tag:
> > carbondata-0.2.0-incubating
> >
> > Please vote to approve this release:
> > [ ] +1 Approve the release
> > [ ] -1 Don't approve the release (please provide specific comments)
> >
> > This vote will be open for at least 72 hours. If this vote passes (we
> > need at least 3 binding votes, meaning three votes from the PPMC), I will
> > forward to general@.apache for the IPMC votes.
> >
> > Here is my vote : +1 (binding)
> >
> > Regards
> > Liang
>
>
>
>
>
> --
> View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/VOTE-Apache-CarbonData-0-2-0-incubating-release-tp2823p2881.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>


RE: Single Pass Data Load Design

2016-11-14 Thread Jihong Ma
Hi Ravindra,

Thank you for putting together a proposal for improving the data load process!

Please find my comments inline in the Google doc.

Jihong

-Original Message-
From: Ravindra Pesala [mailto:ravi.pes...@gmail.com] 
Sent: Sunday, November 13, 2016 4:24 AM
To: dev
Subject: Single Pass Data Load Design

Hi All,

Please find the proposed solutions for single-pass data load:

https://docs.google.com/document/d/1_sSN9lccCZo4E_X3pNP5PchQACqif3AOXKTuG-YJAcc/edit?usp=sharing
-- 
Thanks & Regards,
Ravindra
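
As context for the document above: today's load builds the global dictionary
in a separate pass before the data is written, so a single-pass load has to
assign dictionary surrogate keys on the fly while writing. The sketch below is
only an illustration of that idea under a simplifying assumption (a purely
local, in-memory dictionary; all names are hypothetical); how all load tasks
share and agree on one dictionary is the hard part a real design has to
address.

  import scala.collection.mutable

  // Illustration only: assigns a surrogate key the first time a value is
  // seen, so encoding can happen in the same pass that writes the data.
  final class OnTheFlyDictionary {
    private val keys = mutable.HashMap.empty[String, Int]

    // Existing key for `value`, or the next unused key if it is new.
    def keyFor(value: String): Int = keys.getOrElseUpdate(value, keys.size + 1)
  }

  object SinglePassEncodeSketch {
    def main(args: Array[String]): Unit = {
      val dict = new OnTheFlyDictionary
      val cityColumn = Seq("shenzhen", "beijing", "shenzhen", "shanghai")
      // No prior dictionary-generation pass over the input is needed.
      val encoded = cityColumn.map(dict.keyFor)
      println(encoded) // List(1, 2, 1, 3)
    }
  }

In a distributed load, keyFor would become a (possibly batched) lookup against
a shared dictionary service rather than a local map; that coordination is the
part a real design has to settle.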