Thanks for your help!
I compared the speed of Range.exact and followingKey like this. Is this right?
Scanner scan = conn.createScanner("", new Authorizations());
List<Range> list = new ArrayList<Range>();
for (Map.Entry<Key, Value> entry : scan) {
    // Stop once enough ranges have been collected.
    if (list.size() == resultNum * threadNum) {
        break;
    }
    // The row ID of the data table is stored in the column qualifier of the index entry.
    Key indexKey = entry.getKey();
    Key rowKey = new Key(indexKey.getColumnQualifier());
    Text followRow = rowKey.followingKey(PartialKey.ROW).getRow();
    list.add(new Range(rowKey.getRow(), followRow));
    // Variant being compared:
    // list.add(Range.exact(entry.getKey().getColumnQualifier()));
}
scan.close();
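For anyone following the comparison, here is a minimal sketch of what the followingKey(PartialKey.ROW) step produces, assuming (as I understand the Key API) that it appends a single zero byte to the row. It uses plain byte arrays instead of Accumulo's Key and Text; FollowingRowSketch, followingRow, compareRows and the row_0001/row_0002 values are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class FollowingRowSketch {

    // Assumption: the "following row" is the original row plus one 0x00 byte.
    // Arrays.copyOf pads the extra position with zero.
    static byte[] followingRow(byte[] row) {
        return Arrays.copyOf(row, row.length + 1);
    }

    // Unsigned lexicographic byte comparison (the ordering assumed for rows).
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] row = "row_0001".getBytes(StandardCharsets.UTF_8);
        byte[] next = followingRow(row);
        byte[] other = "row_0002".getBytes(StandardCharsets.UTF_8);

        // The following row sorts after the row itself ...
        System.out.println(compareRows(row, next) < 0);
        // ... and before any other row that sorts after the original.
        System.out.println(compareRows(next, other) < 0);
    }
}
```

Under that assumption the end row is the very next possible row after the start row, so a range from the row to its following row is only marginally wider than the row itself.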
But I find there is no big difference. I built a list of 5000 ranges, and it
takes about 13s with a BatchScanner either way.
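A single run of each variant can be skewed by whatever happens to be cached at the time, so one way to make the comparison fairer is to repeat each variant several times and compare medians. A minimal sketch, assuming each variant is wrapped in a Runnable; ScanTimingSketch, medianMillis and busyWork are hypothetical names, and the workloads are placeholders rather than real Accumulo scans.

```java
import java.util.Arrays;

final class ScanTimingSketch {

    // Runs the task `runs` times and returns the median wall-clock time in milliseconds.
    static double medianMillis(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        // For an even number of runs this picks the upper of the two middle values.
        return nanos[runs / 2] / 1_000_000.0;
    }

    // Placeholder workload standing in for "set the ranges and drain the BatchScanner".
    static long busyWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        Runnable exactVariant = () -> busyWork(200_000);
        Runnable followingKeyVariant = () -> busyWork(200_000);

        System.out.printf("exact:        %.3f ms%n", medianMillis(exactVariant, 5));
        System.out.printf("followingKey: %.3f ms%n", medianMillis(followingKeyVariant, 5));
    }
}
```

In a real test each Runnable would create the BatchScanner, set the ranges, iterate over every entry, and close the scanner.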
I changed my configuration in accumulo-site.xml, and now the #results=0 responses no longer appear.
This is my accumulo-site.xml:
<property>
  <name>tserver.cache.data.size</name>
  <value>4G</value>
</property>
<property>
  <name>tserver.cache.index.size</name>
  <value>16G</value>
</property>
<property>
  <name>tserver.memory.maps.native.enabled</name>
  <value>true</value>
</property>
<property>
  <name>tserver.metadata.readhead.concurrent.max</name>
  <value>65536</value>
</property>
<property>
  <name>tserver.readhead.concurrent.max</name>
  <value>65536</value>
</property>
<property>
  <name>tserver.scan.files.open.max</name>
  <value>65536</value>
</property>
<property>
  <name>table.cache.block.enable</name>
  <value>true</value>
</property>
<property>
  <name>table.cache.index.enable</name>
  <value>true</value>
</property>
Is it ok?
Thanks
Original message
From: Josh [email protected]
To: [email protected]
Sent: Wednesday, January 14, 2015, 11:13
Subject: Re: Reply: how can i optimize scan speed when use batch scan ?

Thanks! That's very helpful. You probably meant to do the following:

Key indexKey = entry.getKey();
Key rowKey = new Key(indexKey.getColumnQualifier());
Text followingRow = rowKey.followingKey(PartialKey.ROW).getRow();
list.add(new Range(rowKey.getRow(), followingRow));

Range.exact(row) will only match a Key which has that exact row ID (empty column family and qualifier). The above will match all keys with the provided row ID (all column families and qualifiers). Does that make sense (and hopefully work)?

覃璐 wrote:

This is the code I use to get the row IDs, which are stored in the column qualifier:

Scanner scan = conn.createScanner("t1", new Authorizations());
List<Range> list = new ArrayList<Range>();
for (Map.Entry<Key, Value> entry : scan) {
    if (list.size() == resultNum * threadNum) {
        break;
    }
    list.add(Range.exact(entry.getKey().getColumnQualifier()));
}
scan.close();

and then I use the row IDs to scan the data:

BatchScanner bs = null;
try {
    bs = conn.createBatchScanner("test.new_index", new Authorizations(), 10);
} catch (TableNotFoundException e) {
    e.printStackTrace();
}
bs.setRanges(list);

Original message
From: Josh [email protected]
To: [email protected]
Sent: Wednesday, January 14, 2015, 10:32
Subject: Re: Reply: how can i optimize scan speed when use batch scan ?

You might need to set tserver.cache.data.size to a larger value. Depending on the amount of data, you might just churn through the cache without getting much benefit. I think you have to restart Accumulo after changing this property.

Can you show us the code you used to try to scan for a row ID and the data in the table you expected to be returned that wasn't?

覃璐 wrote:

Yes, I had received all the results I wanted by the time the program ended. But I do not know why a scan receives 0 results when I am sure the row ID exists. I set table.cache.block.enable=true, but I did not see a noticeable change.

Thanks

Original message
From: Eric [email protected]
To: [email protected]
Sent: Wednesday, January 14, 2015, 00:17
Subject: Re: Reply: how can i optimize scan speed when use batch scan ?

You should have received at least 1390 Key/Value pairs (#results=1390). If your application has many exact RowID look-ups, you may want to investigate Bloom filters. Consider turning on data block caching to reduce latency on future look-ups.

-Eric

On Mon, Jan 12, 2015 at 8:15 PM, 覃璐 [email protected] wrote:

I am sorry, I did not know about the image. The log is this:

[17:50:38] TRACE [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator] [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)] [21521] - tid=65 oid=675 Continuing multi scan, scanid=-152589127623326551
[17:50:38] TRACE [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator] [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)] [21544] - tid=65 oid=675 Got more multi scan results, #results=1390 scanID=-152589127623326551 in 0.023 secs
[17:50:38] TRACE [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator] [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)] [21546] - tid=65 oid=676 Continuing multi scan, scanid=-152589127623326551
[17:50:38] TRACE [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator] [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)] [21555] - tid=45 oid=644 Got more multi scan results, #results=0 scanID=-4477962012178388198 in 1.002 secs
[17:50:38] TRACE [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator] [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)] [21555] - tid=45 oid=677 Continuing multi scan, scanid=-4477962012178388198
[17:50:38] TRACE [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator] [org.apache.accumulo.core.util.OpTimer.stop(OpTimer.java:49)] [21596] - tid=57 oid=645 Got more multi scan results, #results=0 scanID=-8718025066902358141 in 1.003 secs
[17:50:38] TRACE [org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator] [org.apache.accumulo.core.util.OpTimer.start(OpTimer.java:39)] [21596] - tid=57 oid=678 Continuing multi scan, scanid=-8718025066902358141

The scan takes a long time but returns no results. I use 1.6.1, and the config output is this:

default | table.balancer ............................ | org.apache.accumulo.server.master.balancer.DefaultLoadBalancer
default | table.bloom.enabled ....................... | false
default | table.bloom.error.rate .................... | 0.5%
default | table.bloom.hash.type ..................... | murmur
default | table.bloom.key.functor ................... | org.apache.accumulo.core.file.keyfunctor.RowFunctor
default | table.bloom.load.threshold ................ | 1
default | table.bloom.size .......................... | 1048576
default | table.cache.block.enable .................. | false
default | table.cache.index.enable .................. | true
default | table.classpath.context ................... |
default | table.compaction.major.everything.idle .... | 1h
default | table.compaction.major.ratio .............. | 3
default | table.compaction.minor.idle ............... | 5m
default | table.compaction.minor.logs.threshold ..... | 3
table   | table.constraint.1 ........................ | org.apache.accumulo.core.constraints.DefaultKeySizeConstraint
default | table.failures.ignore ..................... | false
default | table.file.blocksize ...................... | 0B
default | table.file.compress.blocksize ............. | 100K
default | table.file.compress.blocksize.index ....... | 128K
default | table.file.compress.type .................. | gz
default | table.file.max ............................ | 15
default | table.file.replication .................... | 0
default | table.file.type ........................... | rf
default | table.formatter ........................... | org.apache.accumulo.core.util.format.DefaultFormatter
default | table.groups.enabled ...................... |
default | table.interepreter ........................ | org.apache.accumulo.core.util.interpret.DefaultScanInterpreter
table   | table.iterator.majc.vers .................. | 20,org.apache.accumulo.core.iterators.user.VersioningIterator
table   | table.iterator.majc.vers.opt.maxVersions .. | 1
table   | table.iterator.minc.vers .................. | 20,org.apache.accumulo.core.iterators.user.VersioningIterator
table   | table.iterator.minc.vers.opt.maxVersions .. | 1
table   | table.iterator.scan.vers .................. | 20,org.apache.accumulo.core.iterators.user.VersioningIterator
table   | table.iterator.scan.vers.opt.maxVersions .. | 1
default | table.majc.compaction.strategy ............ | org.apache.accumulo.tserver.compaction.DefaultCompactionStrategy
default | table.scan.max.memory ..................... | 512K
default | table.security.scan.visibility.default .... |
default | table.split.threshold ..................... | 1G
default | table.walog.enabled ....................... | true

My tablet server has 4 cores and 32G.

Thanks
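
To see at a glance which responses in the trace log above were empty and how long they took, the "Got more multi scan results" lines can be pulled out with a regular expression. A minimal sketch, assuming the line format shown in this message; MultiScanLogSketch, Batch and parse are made-up names for illustration, and the pattern is not part of Accumulo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultiScanLogSketch {

    // One "Got more multi scan results" response from the client trace log.
    static final class Batch {
        final int results;
        final long scanId;
        final double seconds;

        Batch(int results, long scanId, double seconds) {
            this.results = results;
            this.scanId = scanId;
            this.seconds = seconds;
        }
    }

    // \s+ between tokens tolerates lines that were wrapped by the mail client.
    private static final Pattern GOT_RESULTS = Pattern.compile(
            "#results=(\\d+)\\s+scanID=(-?\\d+)\\s+in\\s+(\\d+(?:\\.\\d+)?)\\s+secs");

    static List<Batch> parse(String log) {
        List<Batch> batches = new ArrayList<>();
        Matcher m = GOT_RESULTS.matcher(log);
        while (m.find()) {
            batches.add(new Batch(
                    Integer.parseInt(m.group(1)),
                    Long.parseLong(m.group(2)),
                    Double.parseDouble(m.group(3))));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Three response lines copied from the log in this message.
        String log =
                "Got more multi scan results, #results=1390 scanID=-152589127623326551 in 0.023 secs\n"
              + "Got more multi scan results, #results=0 scanID=-4477962012178388198 in 1.002 secs\n"
              + "Got more multi scan results, #results=0 scanID=-8718025066902358141 in 1.003 secs\n";

        for (Batch b : parse(log)) {
            System.out.println("scanID=" + b.scanId + " results=" + b.results + " secs=" + b.seconds);
        }
    }
}
```

On the excerpt above this lists one response with 1390 results in 0.023 secs and two empty responses at 1.002 and 1.003 secs.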

Original message
From: Josh [email protected]
To: [email protected]
Sent: Monday, January 12, 2015, 23:52
Subject: Re: Reply: how can i optimize scan speed when use batch scan ?

FYI, images don't (typically) come across on the mailing list. Use some external hosting and provide the link if it's important, please.

How many tabletservers do you have? What version of Accumulo are you running? Can you share the output of `config -t your_table_name`?

Thanks.

覃璐 wrote:

Looking at the trace log, why does it receive 0 results and take so long?

Original message
From: 覃璐 [email protected]
To: [email protected]
Sent: Monday, January 12, 2015, 17:05
Subject: how can i optimize scan speed when use batch scan ?

Hi all,

I have code like this:

List<Range> rangeList = …..;
BatchScanner bs = conn.createBatchScanner();
bs.setRanges(rangeList);

The rangeList has about 1000 ranges, and each range is built from a random row ID with Range.exact(new Text(…)). But it is slow: it takes maybe 2-3s. How can I optimize it?

Thanks