Re: Reply: how can i optimize scan speed when use batch scan ?

Josh Elser Tue, 13 Jan 2015 18:33:07 -0800

You might need to set tserver.cache.data.size to a larger value. Depending on the amount of data, you might just churn through the cache without getting much benefit. I think you have to restart Accumulo after changing this property.
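
For reference, a minimal sketch of how that might look from the Accumulo shell. The 1G value is only a placeholder assumption, not a recommendation. The data block cache lives inside the tablet server's Java heap, so the heap size would need to leave room for whatever value is chosen.

```
config -s tserver.cache.data.size=1G
config -f tserver.cache.data.size
```

The first command stores the new value; the second only displays the property so it can be checked before the restart.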

Can you show us the code you used to try to scan for a row ID and the data in the table you expected to be returned that wasn't?


覃璐 wrote:

Yes, I had received all the results I wanted by the time the program ended.

But I do not know why a scan receives 0 results when I am sure the
row ID exists.

I set table.cache.block.enable=true, but I did not see any noticeable
change.

Thanks


Original message
*From:* Eric Newton<[email protected]>
*To:* [email protected]<[email protected]>
*Sent:* January 14, 2015 (Wed) 00:17
*Subject:* Re: Reply: how can i optimize scan speed when use batch scan ?

You should have received at least 1390 Key/Value pairs (#results=1390).

If your application has many exact RowID look-ups, you may want to
investigate Bloom filters.
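
A rough sketch of enabling them from the shell, assuming the table is called your_table_name (a placeholder). Bloom filters are built when a file is written, so existing files only get one once they are rewritten; the compaction here is one way to force that and may take a while on a large table.

```
config -t your_table_name -s table.bloom.enabled=true
compact -t your_table_name -w
```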

Consider turning on data block caching to reduce latency on future look-ups.
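
As a sketch, again with your_table_name as a placeholder, the per-table setting can be changed and then checked like this:

```
config -t your_table_name -s table.cache.block.enable=true
config -t your_table_name -f table.cache
```

The second command just lists the table's cache properties so the new value can be confirmed.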

-Eric


On Mon, Jan 12, 2015 at 8:15 PM, 覃璐 <[email protected]
<mailto:[email protected]>> wrote:

    I am sorry, I did not know about the image.

    The log is this:


    [17:50:38] TRACE
    [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
    [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)]
    [21521] - tid=65 oid=675 Continuing multi scan,
    scanid=-152589127623326551

    [17:50:38] TRACE
    [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
    [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)]
    [21544] - tid=65 oid=675 Got more multi scan results, #results=1390
    scanID=-152589127623326551 in 0.023 secs

    [17:50:38] TRACE
    [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
    [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)]
    [21546] - tid=65 oid=676 Continuing multi scan,
    scanid=-152589127623326551

    [17:50:38] TRACE
    [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
    [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)]
    [21555] - tid=45 oid=644 Got more multi scan results, #results=0
    scanID=-4477962012178388198 in 1.002 secs

    [17:50:38] TRACE
    [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
    [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)]
    [21555] - tid=45 oid=677 Continuing multi scan,
    scanid=-4477962012178388198

    [17:50:38] TRACE
    [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
    [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)]
    [21596] - tid=57 oid=645 Got more multi scan results, #results=0
    scanID=-8718025066902358141 in 1.003 secs

    [17:50:38] TRACE
    [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator]
    [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)]
    [21596] - tid=57 oid=678 Continuing multi scan,
    scanid=-8718025066902358141


    The scan takes a long time but returns no results.


    I use 1.6.1, and the config output is this:


    default | table.balancer ............................ |
    org.apache.accumulo.server.master.balancer.DefaultLoadBalancer

    default | table.bloom.enabled ....................... | false

    default | table.bloom.error.rate .................... | 0.5%

    default | table.bloom.hash.type ..................... | murmur

    default | table.bloom.key.functor ................... |
    org.apache.accumulo.core.file.keyfunctor.RowFunctor

    default | table.bloom.load.threshold ................ | 1

    default | table.bloom.size .......................... | 1048576

    default | table.cache.block.enable .................. | false

    default | table.cache.index.enable .................. | true

    default | table.classpath.context ................... |

    default | table.compaction.major.everything.idle .... | 1h

    default | table.compaction.major.ratio .............. | 3

    default | table.compaction.minor.idle ............... | 5m

    default | table.compaction.minor.logs.threshold ..... | 3

    table | table.constraint.1 ........................ |
    org.apache.accumulo.core.constraints.DefaultKeySizeConstraint

    default | table.failures.ignore ..................... | false

    default | table.file.blocksize ...................... | 0B

    default | table.file.compress.blocksize ............. | 100K

    default | table.file.compress.blocksize.index ....... | 128K

    default | table.file.compress.type .................. | gz

    default | table.file.max ............................ | 15

    default | table.file.replication .................... | 0

    default | table.file.type ........................... | rf

    default | table.formatter ........................... |
    org.apache.accumulo.core.util.format.DefaultFormatter

    default | table.groups.enabled ...................... |

    default | table.interepreter ........................ |
    org.apache.accumulo.core.util.interpret.DefaultScanInterpreter

    table | table.iterator.majc.vers .................. |
    20,org.apache.accumulo.core.iterators.user.VersioningIterator

    table | table.iterator.majc.vers.opt.maxVersions .. | 1

    table | table.iterator.minc.vers .................. |
    20,org.apache.accumulo.core.iterators.user.VersioningIterator

    table | table.iterator.minc.vers.opt.maxVersions .. | 1

    table | table.iterator.scan.vers .................. |
    20,org.apache.accumulo.core.iterators.user.VersioningIterator

    table | table.iterator.scan.vers.opt.maxVersions .. | 1

    default | table.majc.compaction.strategy ............ |
    org.apache.accumulo.tserver.compaction.DefaultCompactionStrategy

    default | table.scan.max.memory ..................... | 512K

    default | table.security.scan.visibility.default .... |

    default | table.split.threshold ..................... | 1G

    default | table.walog.enabled ....................... | true


    And my tablet server has 4 cores and 32 GB.


    Thanks


    Original message
    *From:* Josh Elser<[email protected]>
    *To:* user<[email protected]>
    *Sent:* January 12, 2015 (Mon) 23:52
    *Subject:* Re: Reply: how can i optimize scan speed when use batch scan ?

    FYI, images don't (typically) come across on the mailing list. Use some
    external hosting and provide the link if it's important, please.

    How many tabletservers do you have? What version of Accumulo are you
    running? Can you share the output of `config -t your_table_name`?

    Thanks.

    覃璐 wrote:
    >  I looked at the trace log.
    >
    >
    >  Why does it receive 0 results and take so long?
    >
    >
    >  Original message
    >  *From:* 覃璐<[email protected]>
    >  *To:* user<[email protected]>
    >  *Sent:* January 12, 2015 (Mon) 17:05
    >  *Subject:* how can i optimize scan speed when use batch scan ?
    >
    >  Hi all.
    >
    >  I currently have code like this:
    >
    >  List<Range>  rangeList=…..;
    >  BatchScanner bs=conn.createBatchScanner();
    >  bs.setRanges(rangeList);
    >
    >
    >  The rangeList has many ranges, about 1000, and every range is for a
    >  random row ID, built with Range.exact(new Text(…)).
    >  But it is very slow, taking maybe 2-3 s. How can I optimize it?
    >
    >  Thanks
