Hi Dong, could you give some numbers on |A|, |B|, and the latency of the
two queries?

In our experience, query performance is usually IO bound. However, in the
case you mentioned, both queries have the same scan range, which means the
same amount of IO, so they should return with similar latency. Could it be
a hot/cold cache effect?

The IN predicate is evaluated against a set, so the size of the value list
should not matter.
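
To illustrate what set-based evaluation means, here is a minimal sketch. It
is not Kylin's actual filter code; the class name InFilterSketch and its
methods are hypothetical and chosen only for this example. It assumes the
IN list is loaded once into a hash set, so each row costs one lookup
regardless of how many values the list holds. SQL NULL semantics are
ignored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not Kylin's actual filter code.
public class InFilterSketch {
    private final Set<String> values;

    public InFilterSketch(List<String> inList) {
        // Build the hash set once, up front, from the IN list.
        this.values = new HashSet<>(inList);
    }

    // col IN (...): one hash lookup per row; average cost does not
    // depend on the number of values in the list.
    public boolean in(String columnValue) {
        return values.contains(columnValue);
    }

    // col NOT IN (...): the same lookup, negated.
    public boolean notIn(String columnValue) {
        return !values.contains(columnValue);
    }

    public static void main(String[] args) {
        InFilterSketch filter = new InFilterSketch(Arrays.asList("v1", "v2", "v3"));
        System.out.println(filter.in("v2"));
        System.out.println(filter.notIn("v2"));
        System.out.println(filter.in("v9"));
    }
}
```

Under this assumption, IN {A} and NOT IN {B} do the same per-row work, which
is why a large latency gap between them would point to something else, such
as caching.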

On Mon, Mar 16, 2015 at 5:11 PM, dong wang <[email protected]> wrote:

> Hi Shaofeng, one more question. Suppose we have SQL like: select *
> from t1 where col1 in {A} group by ..
>
> Set {A} is a subset of set {C}, and {B} = {C} - {A} is the difference
> set. Suppose |A| > |B|, where |A| denotes A's cardinality. Our tests show
> that select * from t1 where col1 NOT IN {B} group by .. performs much
> better than select * from t1 where col1 IN {A} group by .., especially
> when |A| is much greater than |B|. Could anyone please explain why?
>
> 2015-03-16 16:11 GMT+08:00 Shi, Shaofeng <[email protected]>:
>
> > Awesome! Thanks for sharing; we will compile these tips into the wiki
> > for all Kylin users.
> >
> > On 3/16/15, 4:03 PM, "dong wang" <[email protected]> wrote:
> >
> > >it works, thanks~
> > >
> > >To add to this: if someone deploys Kylin on a Linux server but wants to
> > >debug it from the source code in Eclipse on Windows, they can add the
> > >following code to kylin.sh and then use Eclipse's Java remote debugging
> > >feature~
> > >
> > >if [ -z "$JPDA_TRANSPORT" ]; then
> > >    JPDA_TRANSPORT="dt_socket"
> > >fi
> > >if [ -z "$JPDA_ADDRESS" ]; then
> > >    JPDA_ADDRESS="8084"      #remote debug port
> > >fi
> > >if [ -z "$JPDA_SUSPEND" ]; then
> > >    JPDA_SUSPEND="n"
> > >fi
> > >if [ -z "$JPDA_OPTS" ]; then
> > >    JPDA_OPTS="-agentlib:jdwp=transport=$JPDA_TRANSPORT,address=$JPDA_ADDRESS,server=y,suspend=$JPDA_SUSPEND"
> > >fi
> > >
> > >export  JAVA_OPTS="-Xms2048M -Xmx2048M -Xss4M"
> > >
> > >
> > >hbase ${JAVA_OPTS} $JPDA_OPTS \
> > >-Djava.util.logging.config.file=${tomcat_root}/conf/logging.properties \
> > >
> > >2015-03-16 15:43 GMT+08:00 Shi, Shaofeng <[email protected]>:
> > >
> > >> It seems the JVM's default stack size is too small for this query.
> > >> You can try giving it a larger stack size when starting Kylin:
> > >>
> > >> in kylin.sh:
> > >>
> > >> export  JAVA_OPTS="-Xms2048M -Xmx2048M -Xss4M"
> > >>
> > >> hbase ${JAVA_OPTS} \
> > >>   -Djava.util.logging.config.file=${CATALINA_HOME}/conf/logging.properties \
> > >>   -Djava.library.path=${KYLIN_LD_LIBRARY_PATH} \
> > >>   -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager \
> > >>   -Dorg.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true \
> > >>   -Dorg.apache.catalina.connector.CoyoteAdapter.ALLOW_BACKSLASH=true \
> > >>   -Dspring.profiles.active=sandbox \
> > >>   -Djava.endorsed.dirs=${CATALINA_HOME}/endorsed \
> > >>   -Dcatalina.base=${CATALINA_HOME} -Dcatalina.home=${CATALINA_HOME} \
> > >>   -Djava.io.tmpdir=${CATALINA_HOME}/temp \
> > >>   org.apache.hadoop.util.RunJar ${CATALINA_HOME}/bin/bootstrap.jar \
> > >>   org.apache.catalina.startup.Bootstrap start \
> > >>   > ${CATALINA_HOME}/logs/kylin.log 2>&1 &
> > >>
> > >>
> > >> Please let us know whether this solves the problem, so that it can
> > >> help other users.
> > >>
> > >> On 3/16/15, 3:25 PM, "dong wang" <[email protected]> wrote:
> > >>
> > >> >When there are about 3500 values in a "select ... from ... where ...
> > >> >in (v1, v2, ... v3500)" query, it throws the error below. I'm
> > >> >debugging it now, but since the problem is urgent, I hope someone
> > >> >can also help take a look.
> > >> >
> > >> >In addition, with only a few values (for example, 10), it returns
> > >> >the correct result!
> > >> >
> > >> >[http-bio-8080-exec-7]:[2015-03-16 15:17:19,301][INFO][org.apache.kylin.query.routing.RoutingRule.applyRules(RoutingRule.java:60)]
> > >> >- ===================================================
> > >> >[http-bio-8080-exec-7]:[2015-03-16 15:17:19,301][INFO][org.apache.kylin.query.routing.QueryRouter.selectRealization(QueryRouter.java:54)]
> > >> >- The realizations remaining:
> > >> >[http-bio-8080-exec-7]:[2015-03-16 15:17:19,301][INFO][org.apache.kylin.query.routing.QueryRouter.selectRealization(QueryRouter.java:55)]
> > >> >- [table1]
> > >> >[http-bio-8080-exec-7]:[2015-03-16 15:17:19,301][INFO][org.apache.kylin.query.routing.QueryRouter.selectRealization(QueryRouter.java:56)]
> > >> >- The realization being chosen: table1
> > >> >[http-bio-8080-exec-7]:[2015-03-16 15:17:22,589][ERROR][org.apache.kylin.rest.controller.QueryController.doQuery(QueryController.java:226)]
> > >> >- Exception when execute sql
> > >> >java.lang.StackOverflowError
> > >> >    at org.codehaus.janino.UnitCompiler.getConstantValue2(UnitCompiler.java:4457)
> > >> >    at org.codehaus.janino.UnitCompiler.access$8900(UnitCompiler.java:183)
> > >> >    at org.codehaus.janino.UnitCompiler$11.visitBinaryOperation(UnitCompiler.java:4371)
> > >> >    at org.codehaus.janino.Java$BinaryOperation.accept(Java.java:3768)
> > >>
> > >>
> >
> >
>
