[jira] [Updated] (KYLIN-2532) Make Hive flat step more extensible

2017-04-13 Thread Roger Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Shi updated KYLIN-2532:
-
Description: 
How data is imported from Hive is defined in Kylin itself. We should make this 
step more extensible for advanced use cases.

For example, if we want to limit the number of Hive records imported into Kylin 
for testing, there is no injection point. Other, more advanced cases also 
require additional injection points in this step.

Currently we build the Hive command by appending strings in code. It might be 
better to replace this with template files, as proposed by [~albertoramon].

  was:How data is imported from Hive is defined in Kylin itself. We should make 
this step more extensible for advanced use cases.


> Make Hive flat step more extensible
> ---
>
> Key: KYLIN-2532
> URL: https://issues.apache.org/jira/browse/KYLIN-2532
> Project: Kylin
> Issue Type: Improvement
> Reporter: Roger Shi
> Priority: Minor
>
> How data is imported from Hive is defined in Kylin itself. We should make this 
> step more extensible for advanced use cases.
> For example, if we want to limit the number of Hive records imported into 
> Kylin for testing, there is no injection point. Other, more advanced cases 
> also require additional injection points in this step.
> Currently we build the Hive command by appending strings in code. It might be 
> better to replace this with template files, as proposed by [~albertoramon].
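
A rough sketch of the template idea, for illustration only. It is not Kylin's 
actual code: the template text, the function name render_flat_table_sql, and 
the table and column names are all hypothetical. It shows how a template with 
optional clauses could provide injection points (here a WHERE filter and a 
LIMIT for testing) that are hard to get when the command is built by appending 
strings.

```python
from string import Template

# Hypothetical template text; in practice it could live in a template file.
FLAT_TABLE_TEMPLATE = Template(
    "INSERT OVERWRITE TABLE ${flat_table}\n"
    "SELECT ${columns}\n"
    "FROM ${fact_table}"
    "${where_clause}${limit_clause};"
)


def render_flat_table_sql(flat_table, columns, fact_table, where=None, limit=None):
    """Render the Hive statement; `where` and `limit` are optional injection points."""
    # Each optional clause carries its own leading newline, so omitting it
    # leaves no blank line behind.
    where_clause = f"\nWHERE {where}" if where else ""
    limit_clause = f"\nLIMIT {int(limit)}" if limit is not None else ""
    return FLAT_TABLE_TEMPLATE.substitute(
        flat_table=flat_table,
        columns=", ".join(columns),
        fact_table=fact_table,
        where_clause=where_clause,
        limit_clause=limit_clause,
    )


# Example: cap the imported records for a test build (names are made up).
test_sql = render_flat_table_sql(
    "kylin_flat_table", ["seller_id", "price"], "sales_fact", limit=1000
)
full_sql = render_flat_table_sql(
    "kylin_flat_table", ["seller_id", "price"], "sales_fact"
)
```

With a template, adding another injection point means adding one placeholder 
rather than touching the string-building code path.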



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KYLIN-2506) Refactor Global Dictionary

2017-04-13 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15967478#comment-15967478
 ] 

kangkaisen commented on KYLIN-2506:
---

So, as of now, how do we ensure the correctness of the global dict in a 
distributed environment?

1 Distributed lock: it ensures that only one thread can write the global dict 
at a time.
2 MVCC: we write the global dict in the working dir and read the global dict 
from the versions dir.
3 Every time we read the global dict, we construct the AppendTrieDictionary 
from the metadata in the latestVersion dir.

Based on the 3 points above, we can ensure that the global dict is written 
sequentially and read in parallel in a distributed environment.
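
The three points can be illustrated with a toy, single-process sketch. It is 
not the Kylin implementation: the class GlobalDictStoreSketch, the dict.json 
file, and the version_N directory names are hypothetical. A threading.Lock 
stands in for the distributed lock, and a JSON mapping stands in for the 
AppendTrieDictionary. Writers build the next version in a working dir and 
publish it with a rename; readers only ever load the latest published version.

```python
import json
import os
import tempfile
import threading


class GlobalDictStoreSketch:
    """Toy model: one writer at a time, readers see only published versions."""

    def __init__(self, base_dir):
        self.base_dir = base_dir
        self.working_dir = os.path.join(base_dir, "working")
        # Stand-in for the distributed lock (assumption: a real deployment
        # needs a lock that works across processes and machines).
        self._write_lock = threading.Lock()

    def version_dirs(self):
        names = [n for n in os.listdir(self.base_dir) if n.startswith("version_")]
        return sorted(names, key=lambda n: int(n.split("_")[1]))

    def read(self):
        """Rebuild the mapping from the latest published version on every read."""
        versions = self.version_dirs()
        if not versions:
            return {}
        path = os.path.join(self.base_dir, versions[-1], "dict.json")
        with open(path) as f:
            return json.load(f)

    def write(self, values):
        """Append unseen values in the working dir, then publish a new version."""
        with self._write_lock:
            mapping = self.read()
            for value in values:
                if value not in mapping:
                    mapping[value] = len(mapping) + 1
            os.makedirs(self.working_dir, exist_ok=True)
            with open(os.path.join(self.working_dir, "dict.json"), "w") as f:
                json.dump(mapping, f)
            next_version = len(self.version_dirs()) + 1
            # The rename publishes the new version; readers never see the working dir.
            os.rename(
                self.working_dir,
                os.path.join(self.base_dir, f"version_{next_version}"),
            )


def demo():
    """Run five concurrent writers and return the store plus the final mapping."""
    store = GlobalDictStoreSketch(tempfile.mkdtemp())
    threads = [
        threading.Thread(target=store.write, args=([f"v{i}", "shared"],))
        for i in range(5)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store, store.read()
```

Because every write happens under the lock and allocates IDs from the latest 
published mapping, concurrent writers never hand out the same ID twice.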

> Refactor Global Dictionary
> --
>
> Key: KYLIN-2506
> URL: https://issues.apache.org/jira/browse/KYLIN-2506
> Project: Kylin
> Issue Type: Improvement
> Components: General
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Fix For: v2.0.0
>
>
> The main points of this refactor:
> 1 Fix the bug that the RemoveListener of LoadingCache swallowed any 
> exceptions when building the GlobalDict.
> 2 Fix the bug that the HDFS filename of DictSliceKey had illegal characters.
> 3 Fix the bug that the HDFS filename of DictSliceKey may be longer than 255.
> 4 Fix the bug that DictNode split failed if the value length is greater than 
> 255 bytes.
> 5 Decouple the build and query of GlobalDict: 
> abstract the builder of AppendTrieDictionary to AppendTrieDictionaryBuilder; 
> add LoadingCache to AppendTrieDictionary and make AppendTrieDictionary 
> read-only.
> 6 Remove the dependence on LoadingCache when building the GlobalDict.
> 7 Abstract the HDFS operations to GlobalDictStore.
> 8 Abstract the metadata of GlobalDict to GlobalDictMetadata.
> 9 Delete CachedTreeMap.
> 10 Add distributed lock for GlobalDict.





[jira] [Updated] (KYLIN-472) Ambari plugin to manage Kylin service

2017-04-13 Thread liyang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyang updated KYLIN-472:
-
Description: 
Ambari is a great tool for managing most Hadoop components in one place:

![screen shot 2014-10-30 at 4 08 15 
pm|https://cloud.githubusercontent.com/assets/1104017/4840168/33880466-600c-11e4-8f70-c68e1e6bf353.png]

It makes sense to use Ambari to easily manage the Kylin service in the same place.
Requirements:
1. Deploy and install Kylin service via Ambari
2. Start and stop Kylin service through Ambari
3. Display Kylin service status on Ambari Web

[Call Volunteer!]
--
We are not Ambari experts; please let us know if you are interested in 
contributing to this!


  was:
Ambari is a great tool for managing most Hadoop components in one place:

![screen shot 2014-10-30 at 4 08 15 
pm|https://cloud.githubusercontent.com/assets/1104017/4840168/33880466-600c-11e4-8f70-c68e1e6bf353.png]

It makes sense to use Ambari to easily manage the Kylin service in the same place.
Requirements:
1. Deploy and install Kylin service via Ambari
2. Start and stop Kylin service through Ambari
3. Display Kylin service status on Ambari Web

[Call Volunteer!]
--
We are not Ambari experts; please let us know if you are interested in 
contributing to this!

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/33
Created by: [lukehan|https://github.com/lukehan]
Labels: 
Created at: Thu Oct 30 16:14:46 CST 2014
State: open



> Ambari plugin to manage Kylin service
> -
>
> Key: KYLIN-472
> URL: https://issues.apache.org/jira/browse/KYLIN-472
> Project: Kylin
> Issue Type: Wish
> Reporter: Luke Han
> Assignee: Yifan Zhang
> Labels: bigtask, github-import
>
> Ambari is a great tool for managing most Hadoop components in one place:
> ![screen shot 2014-10-30 at 4 08 15 
> pm|https://cloud.githubusercontent.com/assets/1104017/4840168/33880466-600c-11e4-8f70-c68e1e6bf353.png]
> It makes sense to use Ambari to easily manage the Kylin service in the same 
> place.
> Requirements:
> 1. Deploy and install Kylin service via Ambari
> 2. Start and stop Kylin service through Ambari
> 3. Display Kylin service status on Ambari Web
> [Call Volunteer!]
> --
> We are not Ambari experts; please let us know if you are interested in 
> contributing to this!


