Re: select * clause still cause all regionserver crash

2016-12-01 Thread Alberto Ramón
About select * from: see Kylin 1792 (v1.5.3)

Can you check whether you see this error: "Scan row count exceeded threshold:
100, please add filter condition to narrow down backend scan range,
like where clause"

It shows up with queries like this:

SELECT * FROM TB

There are some limits:

- No more than 1M rows read per RS (regionserver)
- No more than 100MB read per RS
- Total Kylin result of about 3.0 GB
- A time limit to answer the query (I don't remember if it was 10 seconds)


In summary: Kylin can't read billions of rows and compute over them, because
it is an OLAP cube and we expect it to answer queries very fast (this is the
reason precalculated cubes exist).


You must add a where condition, as in Select column from TB Where ..., so that
predicate pushdown can narrow the scan for the query.
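
To make the effect of the where clause concrete, here is a toy model in Java.
It is only a sketch under my own assumptions, not Kylin's code: the class
ScanGuardSketch, its methods and the integer row keys are all hypothetical.
It scans a sorted table, fails once the row count passes a threshold, and
shows that a filter translated into a key range reads far fewer rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical toy model of a backend scan with a row-count threshold.
class ScanGuardSketch {

    // Builds a sorted table with keys 0..n-1 (stand-in for rows in a region).
    static SortedMap<Integer, String> demoTable(int n) {
        SortedMap<Integer, String> table = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            table.put(i, "row-" + i);
        }
        return table;
    }

    // Scans keys in [fromKey, toKey) and fails as soon as more than
    // `threshold` rows would be read, loosely mirroring the error above.
    static List<String> scan(SortedMap<Integer, String> table,
                             int fromKey, int toKey, int threshold) {
        List<String> out = new ArrayList<>();
        for (String row : table.subMap(fromKey, toKey).values()) {
            if (out.size() >= threshold) {
                throw new IllegalStateException(
                        "Scan row count exceeded threshold: " + threshold);
            }
            out.add(row);
        }
        return out;
    }
}

class ScanGuardSketchDemo {
    public static void main(String[] args) {
        SortedMap<Integer, String> table = ScanGuardSketch.demoTable(1000);

        // With a filter pushed down as a key range, only 10 rows are read.
        System.out.println(ScanGuardSketch.scan(table, 10, 20, 100).size());

        // Without a filter the whole table is scanned and the guard trips.
        try {
            ScanGuardSketch.scan(table, 0, 1000, 100);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the toy is that the filter shrinks the range before any row is
read; a limit applied only after the scan would not help in the same way.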




2016-12-02 3:23 GMT+01:00 alaleiwang :

> KYLIN-1787 says it is fixed in v1.5.3, but it still happens on v1.5.1 through v1.5.3
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/select-clause-still-cause-all-regionserver-crash-tp6474p6476.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>


[jira] [Created] (KYLIN-2245) Refine CubeSegment

2016-12-01 Thread Wang Cheng (JIRA)
Wang Cheng created KYLIN-2245:
-

         Summary: Refine CubeSegment
             Key: KYLIN-2245
             URL: https://issues.apache.org/jira/browse/KYLIN-2245
         Project: Kylin
      Issue Type: Bug
        Reporter: Wang Cheng
        Priority: Minor


A plain List cannot express the relations among CubeSegments; for example,
CubeInstance contains a lot of operations that work across Segments.

We will replace the List with a new class, Segments.
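
As a rough illustration of the idea (not the actual patch), a dedicated
collection class could look like the sketch below. Seg, Segments and every
method name here are hypothetical stand-ins, and the fields are simplified
assumptions. The collection is still a List, but the operations that reason
about how segments relate to each other live on it instead of on the caller.

```java
import java.util.ArrayList;

// Hypothetical stand-in for a cube segment: a name, a half-open range
// [start, end) and a ready flag. Not Kylin's real CubeSegment.
class Seg {
    final String name;
    final long start;
    final long end;
    final boolean ready;

    Seg(String name, long start, long end, boolean ready) {
        this.name = name;
        this.start = start;
        this.end = end;
        this.ready = ready;
    }
}

// Sketch of a "Segments" collection that owns cross-segment operations.
class Segments extends ArrayList<Seg> {

    // Returns only the segments that are ready to serve queries.
    Segments readyOnly() {
        Segments out = new Segments();
        for (Seg s : this) {
            if (s.ready) {
                out.add(s);
            }
        }
        return out;
    }

    // Returns the ready segment with the greatest end, or null if none.
    Seg latestReady() {
        Seg latest = null;
        for (Seg s : this) {
            if (s.ready && (latest == null || s.end > latest.end)) {
                latest = s;
            }
        }
        return latest;
    }

    // Returns true if any two segments cover overlapping ranges.
    boolean hasOverlap() {
        for (int i = 0; i < size(); i++) {
            for (int j = i + 1; j < size(); j++) {
                Seg a = get(i);
                Seg b = get(j);
                if (a.start < b.end && b.start < a.end) {
                    return true;
                }
            }
        }
        return false;
    }
}

class SegmentsDemo {
    public static void main(String[] args) {
        Segments segs = new Segments();
        segs.add(new Seg("s1", 0, 10, true));
        segs.add(new Seg("s2", 10, 20, false));
        System.out.println(segs.readyOnly().size());
        System.out.println(segs.latestReady().name);
        System.out.println(segs.hasOverlap());
    }
}
```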



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KYLIN-2244) "kylin.job.cuboid.size.memhungry.ratio" shouldn't be applied on measures like TopN

2016-12-01 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-2244:
---

         Summary: "kylin.job.cuboid.size.memhungry.ratio" shouldn't be applied on measures like TopN
             Key: KYLIN-2244
             URL: https://issues.apache.org/jira/browse/KYLIN-2244
         Project: Kylin
      Issue Type: Improvement
        Reporter: Shaofeng SHI


The parameter "kylin.job.cuboid.size.memhungry.ratio" (new name
"kylin.cube.size-estimate-memhungry-ratio"), whose default value is 0.05, is
based on the compression ratio of HyperLogLog; it doesn't fit other
memory-hungry measures such as TopN, Raw, etc.







[jira] [Created] (KYLIN-2243) TopN memory estimation is inaccurate in some cases

2016-12-01 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-2243:
---

         Summary: TopN memory estimation is inaccurate in some cases
             Key: KYLIN-2243
             URL: https://issues.apache.org/jira/browse/KYLIN-2243
         Project: Kylin
      Issue Type: Bug
        Reporter: Shaofeng SHI
         Fix For: Backlog


TopNCounterSerializer.maxLength() and
TopNCounterSerializer.getStorageBytesEstimate() might be inaccurate, especially
when there are multiple "group by" columns in one TopN measure and some of them
use a long byte encoding such as "fixed_length:16".

The inaccurate estimation may cause memory issues when using in-mem cubing, and
will make the estimate of the final cube size inaccurate.





Re: select * clause still cause all regionserver crash

2016-12-01 Thread alaleiwang
KYLIN-1787 says it is fixed in v1.5.3, but it still happens on v1.5.1 through v1.5.3

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/select-clause-still-cause-all-regionserver-crash-tp6474p6476.html


Re: select * clause still cause all regionserver crash

2016-12-01 Thread Alberto Ramón
Sounds like:
KYLIN-1787 

MailList

MailList



2016-12-01 14:04 GMT+01:00 alaleiwang :

> hi:
>    I asked this question at:
> http://apache-kylin.74782.x6.nabble.com/some-question-about-setMaxResultSize-for-scanner-CubeHBaseScanRPC-CubeHBaseEndpointRPC-td4983.html#a5007
>
>    https://issues.apache.org/jira/browse/KYLIN-1787 says the problem is
> solved,
>
>    but I am still suffering from the same thing: all regionservers (180+)
> crash from time to time because of the clause "select * from table", and
> limit does not help.
>
>    The related cube size is about 3.88TB, and my regionserver memory sums up
> to 2.04T.
>
>    This happened on kylin 1.5.1 and kylin 1.5.3; I have not tested on kylin
> 1.5.4.
>
>    More to add:
>    select * from tablename where aa=bb  limit 10 does work, and will not
> crash the regionservers.
>
>
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/select-clause-still-case-all-regionserver-crash-tp6474.html
>


select * clause still cause all regionserver crash

2016-12-01 Thread alaleiwang
hi:
   I asked this question at:
http://apache-kylin.74782.x6.nabble.com/some-question-about-setMaxResultSize-for-scanner-CubeHBaseScanRPC-CubeHBaseEndpointRPC-td4983.html#a5007

   https://issues.apache.org/jira/browse/KYLIN-1787 says the problem is
solved,

   but I am still suffering from the same thing: all regionservers (180+)
crash from time to time because of the clause "select * from table", and
limit does not help.

   The related cube size is about 3.88TB, and my regionserver memory sums up
to 2.04T.

   This happened on kylin 1.5.1 and kylin 1.5.3; I have not tested on kylin
1.5.4.

   More to add:
   select * from tablename where aa=bb  limit 10 does work, and will not crash
the regionservers.
   


--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/select-clause-still-case-all-regionserver-crash-tp6474.html


[jira] [Created] (KYLIN-2242) Directly write hdfs file in reducer is dangerous

2016-12-01 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-2242:
-

         Summary: Directly write hdfs file in reducer is dangerous
             Key: KYLIN-2242
             URL: https://issues.apache.org/jira/browse/KYLIN-2242
         Project: Kylin
      Issue Type: Bug
      Components: Job Engine
Affects Versions: v1.6.0
        Reporter: kangkaisen
        Assignee: Dong Li


Currently, Kylin writes HDFS files directly in {{FactDistinctColumnsReducer}},
which is dangerous because MapReduce speculative execution can result in more
than one reducer writing the same HDFS file at the same time.

After KYLIN-2217, I think this issue will occur with higher probability. We
should output the values with {{context.write}} in the reducer.
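
To illustrate why framework-managed output is safer, here is a pure-Java
simulation; it is not Hadoop or Kylin code. The class AttemptScopedOutput, its
methods and the file names are hypothetical, and the local file system stands
in for HDFS. Each attempt writes to its own temporary file, and only one
attempt is promoted to the final path, which is roughly what happens to
records emitted through {{context.write}} with a file-based output committer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation of attempt-scoped output with a single commit.
class AttemptScopedOutput {

    // Creates a scratch directory that stands in for the job output dir.
    static Path newWorkDir() {
        try {
            return Files.createTempDirectory("speculative-demo");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Each attempt writes to its own temporary file, never the final path.
    static Path writeAttempt(Path dir, String attemptId, List<String> lines) {
        try {
            Path tmp = dir.resolve("_temporary_" + attemptId);
            Files.write(tmp, lines, StandardCharsets.UTF_8);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Promotes one attempt to the final path. A later attempt finds the
    // final file already present and is discarded. This sketch assumes
    // commits are serialized by a coordinator, so the check is not racy.
    static boolean commit(Path attemptFile, Path finalFile) {
        try {
            if (Files.exists(finalFile)) {
                Files.deleteIfExists(attemptFile);
                return false;
            }
            Files.move(attemptFile, finalFile, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the committed file back as lines.
    static List<String> read(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class AttemptScopedOutputDemo {
    public static void main(String[] args) {
        Path dir = AttemptScopedOutput.newWorkDir();
        Path finalFile = dir.resolve("part-r-00000");
        List<String> values = Arrays.asList("a", "b", "c");

        // Two speculative attempts of the same reducer produce the same data.
        Path first = AttemptScopedOutput.writeAttempt(dir, "attempt_0", values);
        Path second = AttemptScopedOutput.writeAttempt(dir, "attempt_1", values);

        System.out.println(AttemptScopedOutput.commit(first, finalFile));
        System.out.println(AttemptScopedOutput.commit(second, finalFile));
        System.out.println(AttemptScopedOutput.read(finalFile));
    }
}
```

With direct writes, both attempts would open the final path themselves and
could interleave; here the second attempt never touches it.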


