Hi Dave,
Thanks for your suggestion!
Actually, the table is a Trafodion table; it just has _HIVE in its name.
For your 3 steps:
1. prepare
It takes 2 seconds.
2. execute the first time
It takes 78 seconds. Of that, starting all the ESPs takes less
than 1 second.
3. execute the second time
It takes 3 seconds.
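To make the breakdown explicit, here is a small sketch of the arithmetic using only the timings above. The variable and function names are just for illustration, and the ESP startup is taken at its upper bound of 1 second (an assumption, since we only measured "less than 1 second"):

```python
# Timings reported above, in seconds.
prepare_s = 2          # step 1: compile (prepare)
first_execute_s = 78   # step 2: first execution
second_execute_s = 3   # step 3: re-execution
esp_startup_max_s = 1  # assumption: upper bound, measured as "less than 1 second"


def unexplained_first_run_overhead(first_s, repeat_s, esp_startup_s):
    """Part of the first execution not covered by the repeat run or ESP startup."""
    return first_s - repeat_s - esp_startup_s


total_first_run_s = prepare_s + first_execute_s
overhead_s = unexplained_first_run_overhead(
    first_execute_s, second_execute_s, esp_startup_max_s
)

print(f"prepare + first execute: {total_first_run_s} s")
print(f"unexplained first-run overhead: at least {overhead_s} s")
```

So compile time and ESP startup together explain only about 3 of the roughly 80 seconds; the rest is spent somewhere in the first execution.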
So I am wondering: what does an ESP do when it is launched?
Thanks
Joshua
-----Original Message-----
From: Dave Birdsall [mailto:[email protected]]
Sent: December 31, 2017 8:42
To: [email protected]; [email protected]
Subject: RE: how would esp do when it was launched?
Hi,
How big is the table? How many esps are we creating? Perhaps we are creating
the esps serially; maybe that is what is taking the time.
Another factor to look at is compile time. You can separate that out as follows:
Step 1: using trafci, do a "prepare" of your query. See how long that takes.
Step 2: then execute the query. How long does that take?
Step 3: re-execute the query. How long does that take?
I expect part of that 80 seconds will be consumed in step 1 as compile time.
Would be interesting to know if compile time is, say, 2 seconds or 78 seconds.
If the latter, perhaps the issue is how we read statistics for a Hive table
with many partitions.
Dave
-----Original Message-----
From: Liu, Yao-Hua (Joshua) [mailto:[email protected]]
Sent: Friday, December 22, 2017 1:01 AM
To: [email protected]
Subject: how would esp do when it was launched?
Hi all,
Suresh and I found something interesting when running some queries.
Step 1:
Using trafci, run the query: select count(*) from CELL_INDICATOR_HIVE where
starttime=20170801000000000; // CELL_INDICATOR_HIVE has 100 billion rows, and
each starttime value has 4346483 rows. Starttime is the first column in the
store by key.
This takes about 1 minute and 20 seconds to finish.
Step 2:
Run the above SQL again; this time it takes 3 seconds to finish.
Given 80s vs. 3s, one might guess the difference is due to ESP start time or
caching. But we checked:
1. Starting all the ESPs takes less than 1 second.
2. If it were due to caching, a query on another table should be slow, so we
ran one as a test:
Step 3:
Run another query: select count(*) from SERVERIP_INDICATOR_BAK where
starttime=20170801000000000; // SERVERIP_INDICATOR_BAK has 64 billion rows, and
each starttime value has 2.8 million rows. Starttime is also the first column
in the store by key. This takes 2 seconds to finish.
By the way, if we start another trafci session (connected to a different
mxosrvr than the one above) and run the above select count(*) from
SERVERIP_INDICATOR_BAK where starttime=20170801000000000, it also takes
1 minute or more.
So we are wondering: what does an ESP do when it is started? Why does the
first scan of one table take so much time, while the next scan, even of a
different table, is much faster?
Thanks
Joshua