Re: corrupt metastore

2016-12-01 Thread Alberto Ramón
Yes, yes,
I have had this type of problem; I needed to use:
  hdfs fsck
  hbase hbck
That solved all the problems --> perhaps some data has been lost
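For reference, the checks above can be run roughly like this (a sketch; exact flags vary by Hadoop/HBase version, the `/kylin` path assumes the default Kylin working directory, and `hbck` repair options should be used with great care):

```shell
# Report HDFS block health under the Kylin working directory (read-only)
hdfs fsck /kylin -files -blocks -locations

# Report HBase region/table inconsistencies (read-only by default in 1.x)
hbase hbck -details
```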

The next steps would be:
-  check Kylin's metadata
-  check consistency between the metadata and Kylin's tables


But I don't know if there are tools/commands to do this.
I looked at the metastore.sh script, but I can't find this functionality
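For what it's worth, the bundled metastore.sh does offer backup and cleanup operations (a sketch based on Kylin's backup-metadata how-to; it is not a full consistency checker):

```shell
# Dump the metadata store from HBase to a local directory for inspection
$KYLIN_HOME/bin/metastore.sh backup

# List resources that look unreferenced (dry run), then actually remove them
$KYLIN_HOME/bin/metastore.sh clean
$KYLIN_HOME/bin/metastore.sh clean --delete true
```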



2016-12-02 2:46 GMT+01:00 ShaoFeng Shi :

> Hi Alberto, it looks like the HBase service is in trouble; please check it
> first.
>
> 2016-12-02 8:03 GMT+08:00 Alberto Ramón :
>
>> I had some problems with corrupt data on HDFS and Meta HDFS
>> Now all services started OK
>>
>> *No query executes on any cube *
>> *Error while executing SQL "select part_dt, sum(price) as total_selled,
>> count(distinct seller_id) as sellers from kylin_sales group by part_dt
>> order by part_dt LIMIT 5":
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>> attempts=5, exceptions: Fri Dec 02 07:31:07 GMT+08:00 2016,
>> org.apache.hadoop.hbase.client.RpcRetryingCaller@6cb60fb6,
>> com.google.protobuf.InvalidProtocolBufferException:
>> com.google.protobuf.InvalidProtocolBufferException: Protocol message tag
>> had invalid wire type. at
>> com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
>> at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom*
>>
>>
>> *I tried to rebuild the cube, but:*
>>
>>
>>
>>
>> *Could not read JSON: Can not construct instance of long from String
>> value '2000-12-07 06:30:00': not a valid Long value at [Source:
>> org.apache.catalina.connector.CoyoteInputStream@6fcdf2de; line: 1, column:
>> 21] (through reference chain:
>> org.apache.kylin.rest.request.JobBuildRequest["startTime"]); nested
>> exception is com.fasterxml.jackson.databind.exc.InvalidFormatException: Can
>> not construct instance of long from String value '2000-12-07 06:30:00': not
>> a valid Long value at [Source:
>> org.apache.catalina.connector.CoyoteInputStream@6fcdf2de; line: 1, column:
>> 21] (through reference chain:
>> org.apache.kylin.rest.request.JobBuildRequest["startTime"]*
>>
>> *Any ideas? I'm trying metastore.sh; is there some check tool?*
>> 2016-12-01 16:21:34,162 ERROR [pool-7-thread-1] dao.ExecutableDao:148 :
>> error get all Jobs:
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>> attempts=6, exceptions:
>> Fri Dec 02 05:21:34 GMT+08:00 2016, null, java.net.SocketTimeoutException:
>> callTimeout=6, callDuration=122823: row '/execute/' on table
>> 'kylin_metadata' at region=kylin_metadata,,1477759808710.faab4c988f06f17d9e903068db5b3b81.,
>> hostname=amb0.mycorp.kom,60020,1480614855596, seqNum=1664
>>
>> at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:262)
>> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:199)
>>
>> Caused by: java.net.SocketTimeoutException: callTimeout=6,
>> callDuration=122823: row '/execute/' on table 'kylin_metadata' at
>> region=kylin_metadata,,1477759808710.faab4c988f06f17d9e903068db5b3b81.
>>
>> *(re-deploying everything isn't a problem; this is only for knowledge)*
>>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>
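
On the rebuild error quoted above: the `JobBuildRequest["startTime"]` exception says Kylin's REST rebuild endpoint expects startTime/endTime as a long (epoch milliseconds), not a date string like '2000-12-07 06:30:00'. A hedged sketch of converting and submitting (the cube name is hypothetical; ADMIN/KYLIN and port 7070 are Kylin's defaults):

```shell
# JobBuildRequest.startTime must be a long (epoch milliseconds),
# not a timestamp string like '2000-12-07 06:30:00'
START=$(TZ=UTC date -d '2000-12-07 06:30:00' +%s000)
END=$(TZ=UTC date -d '2001-01-01 00:00:00' +%s000)
echo "startTime=$START endTime=$END"

# Hypothetical cube name; || true so the sketch is harmless without a live server
curl -X PUT -u ADMIN:KYLIN -H 'Content-Type: application/json' \
  -d "{\"startTime\":$START,\"endTime\":$END,\"buildType\":\"BUILD\"}" \
  "http://localhost:7070/kylin/api/cubes/kylin_sales_cube/rebuild" || true
```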


Re: corrupt metastore

2016-12-01 Thread ShaoFeng Shi
Hi Alberto, it looks like the HBase service is in trouble; please check it
first.
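
Checking the HBase side could start with something like the following (a sketch, assuming the default kylin_metadata table name and an HBase 1.x-style hbck):

```shell
# Is the cluster healthy overall?
echo "status 'summary'" | hbase shell

# Any inconsistencies on the Kylin metadata table specifically?
hbase hbck kylin_metadata

# Can the problem region actually be read? (times out if the region is stuck)
echo "scan 'kylin_metadata', {LIMIT => 1}" | hbase shell
```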




-- 
Best regards,

Shaofeng Shi 史少锋


Re: Consulting "EXTENDED_COLUMN"

2016-12-01 Thread Alberto Ramón
Nice, Liu

We have some cases like:
DayWeekTXT, DayWeekID
MonthTXT, MonthID

Small proposal:
it could be interesting to support a derived column with a 1:1 relation,
with support for filters and GROUP BY

2016-12-01 11:55 GMT+01:00 Billy(Yiming) Liu :

> The cost of a joint dimension compared with an extended column is that you have
> more columns in the HBase rowkey, which may hurt query performance. But most of
> the time, a joint dimension is still recommended, since a normal dimension
> column supports many more functions than an extended column, such as count(*).
>
> 2016-12-01 17:07 GMT+08:00 Alberto Ramón :
>
>> Hello
>> I was preparing an email with related doubts:
>>
>> Sometimes we have derived dimensions with a 1:1 relation, for example:
>> WeekDayID & WeekDayTxt
>> MonthID & MonthTxt
>>
>> Option 1 (Derived): ID as host column, Txt as extended column
>> Problem: you can't filter or group by Txt
>>
>> Option 2 (Joint): define tuples of ID & TXT
>> Any problems/limitations? (I need to test this option)
>>
>> 2016-12-01 0:35 GMT+01:00 Billy(Yiming) Liu :
>>
>>> Thanks, Alberto. The explanation is accurate. EXTENDED_COLUMN is only
>>> used for representation, not for filtering or grouping, which is done by the
>>> HOST_COLUMN. So an EXTENDED_COLUMN is not a dimension; it works like a
>>> key/value map against the HOST_COLUMN.
>>>
>>> If the value in the EXTENDED_COLUMN is not long, you could just define the
>>> two columns as dimensions with a joint dimension setting; it has almost the
>>> same performance impact as EXTENDED_COLUMN (which saves one dimension), but
>>> is easier to understand.
>>>
>>> 2016-11-30 19:00 GMT+08:00 Alberto Ramón :
>>>
 This will help you:
 http://kylin.apache.org/docs/howto/howto_optimize_cubes.html

 The idea is always: how can I reduce the number of dimensions?
 If you reduce dimensions, the time/resources to build the cube and its
 final size decrease --> that's good

 An example can be DIM_Persons: Id_Person, Name, Surname, Address, .
 Id_Person can be the host column,
 and the other columns can be derived from the ID --> they are extended columns




 2016-11-30 11:35 GMT+01:00 仇同心 :

 > Hi all,
 > I don't understand the usage scenarios of EXTENDED_COLUMN, although I saw
 > this article “https://issues.apache.org/jira/browse/KYLIN-1313”.
 > What do the “Host Column” and “Extended Column” parameters mean?
 > Why use this feature, and what aspects of optimization does it solve?
 > Can it be explained with a SQL statement?
 >
 >
 > Thanks~
 >

>>>
>>>
>>>
>>> --
>>> With Warm regards
>>>
>>> Yiming Liu (刘一鸣)
>>>
>>
>>
>
>
> --
> With Warm regards
>
> Yiming Liu (刘一鸣)
>
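
As a small illustration of the DIM_Persons discussion above: with Id_Person as the host column and Name as an extended column, Name can be returned in results but (as noted) not filtered or grouped on. A hedged sketch against Kylin's query REST API (project, table, and column names are made up for the example; defaults assumed for port and credentials):

```shell
# Supported: group by the host column; Kylin serves Name from the
# extended-column mapping of Id_Person
SQL_OK='select Id_Person, count(*) from DIM_Persons group by Id_Person'

# Not supported with EXTENDED_COLUMN: filtering/grouping on the extended
# column itself; Name would need to be a real (or joint) dimension instead
SQL_BAD="select Name, count(*) from DIM_Persons where Name = 'Smith' group by Name"

# Hypothetical submission; || true so the sketch is harmless without a live server
curl -X POST -u ADMIN:KYLIN -H 'Content-Type: application/json' \
  -d "{\"sql\":\"$SQL_OK\",\"project\":\"demo\"}" \
  http://localhost:7070/kylin/api/query || true
```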


Re: User MailList

2016-12-01 Thread Alberto Ramón
Nice!!
It will be very helpful for finding similar problems

2016-12-01 13:31 GMT+01:00 Luke Han :

> already working on that
>
>
> On Thu, Dec 1, 2016 at 5:15 PM +0800, "Alberto Ramón" <
> a.ramonporto...@gmail.com> wrote:
>
>> Small Proposal:
>>
>> The dev mailing list is on Nabble (more practical than mail-archives.apache.org:
>> you can search by text, see pictures, and it is more readable)
>>
>> Is it possible to do the same with the user list?
>>
>> (nowadays, a lot of users' questions go to the dev mailing list, or to both)
>>
>