What's the status of Accumulo on YARN?

2014-09-08 Thread Jianshi Huang
I heard about it from the AS14 slides, and looks like it will make running our own Accumulo cluster a lot easier for us. Is there a place that I can get the latest scripts and manuals? What's the current status of it? Cheers, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog

Re: What's the status of Accumulo on YARN?

2014-09-08 Thread Jianshi Huang
, Jianshi Huang jianshi.hu...@gmail.com wrote: I heard about it from the AS14 slides, and looks like it will make running our own Accumulo cluster a lot easier for us. Is there a place that I can get the latest scripts and manuals? What's the current status of it? Cheers, -- Jianshi Huang

Need help (gets Error: Instance has not been configured for AccumuloOutputFormat)

2014-07-18 Thread Jianshi Huang
()) AccumuloOutputFormat.setCreateTables(job, true) AccumuloOutputFormat.setDefaultTableName(job, Conf.getString("accumulo.entity.table")) Any idea why it happened? Cheers, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/

Re: How does Accumulo compare to HBase

2014-07-17 Thread Jianshi Huang
#rows fixed? -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/

Re: How does Accumulo compare to HBase

2014-07-17 Thread Jianshi Huang
, Jul 17, 2014 at 4:09 PM, Ted Yu yuzhih...@gmail.com wrote: W.r.t. HBase filter's performance, can you let us know the source of the information ? Is performance bad for all filters or some types of filters ? Cheers On Jul 16, 2014, at 11:48 PM, Jianshi Huang jianshi.hu...@gmail.com wrote

Forgot SECRET, how to delete zookeeper nodes?

2014-07-13 Thread Jianshi Huang
Clusters got updated and user home files lost... I tried to reinstall accumulo but I forgot the secret I put before. So how can I delete /accumulo in Zookeeper? Or is there a way to rename instance_id? -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http

Re: Forgot SECRET, how to delete zookeeper nodes?

2014-07-13 Thread Jianshi Huang
it recently. I would not recommend deleting the information in zookeeper unless there is no other option; you may lose the data IMO. On Mon, Jul 14, 2014 at 8:40 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: Clusters got updated and user home files lost... I tried to reinstall accumulo

Re: Forgot SECRET, how to delete zookeeper nodes?

2014-07-13 Thread Jianshi Huang
at 9:06 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: It's too deleted... so the only option I have is to delete the zookeeper nodes and reinitialize accumulo. You're right, I deleted the zk nodes and now Accumulo complains about a NoNode error. Can I recover the tables for a new instance

Re: REST server for the D4M Schema

2014-06-30 Thread Jianshi Huang
to return. If you have a feature that you want in the REST server just let me know and I'll consider implementing it. Also feel free to fork the project and add your own functionality. -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/

Fwd: How does Accumulo compare to HBase

2014-06-25 Thread Jianshi Huang
have currently is reverse scan. https://issues.apache.org/jira/browse/HBASE-4811 I already found a use case in my prototype! Jianshi On Wed, Jun 25, 2014 at 2:04 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: Thank you David! Will do. On Wed, Jun 25, 2014 at 2:02 AM, David Medinets

Re: How does Accumulo compare to HBase

2014-06-25 Thread Jianshi Huang
. - production user: production - beta testing user: beta production - alpha testing user: alpha beta production BTW, will they be counted as the same record with different versions? Or as different records? Does that make sense? Jianshi On Wed, Jun 25, 2014 at 3:51 PM, Jianshi Huang jianshi.hu...@gmail.com

Re: How does Accumulo compare to HBase

2014-06-25 Thread Jianshi Huang
Ah I see. Then I need to control versioning myself. A customized versioning iterator aware of a/b/prod labels? Maybe there's a better way to do it. Jianshi On Wed, Jun 25, 2014 at 4:19 PM, Sean Busbey bus...@cloudera.com wrote: On Wed, Jun 25, 2014 at 2:52 AM, Jianshi Huang jianshi.hu

Re: How does Accumulo compare to HBase

2014-06-25 Thread Jianshi Huang
in every environment. On Wed, Jun 25, 2014 at 5:30 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: Ah I see. Then I need to control versioning myself. A customized versioning iterator aware of a/b/prod labels? Maybe there's a better way to do it. Jianshi On Wed, Jun 25, 2014

Re: How does Accumulo compare to HBase

2014-06-25 Thread Jianshi Huang
:30 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: Ah I see. Then I need to control versioning myself. A customized versioning iterator aware of a/b/prod labels? Maybe there's a better way to do it. Jianshi On Wed, Jun 25, 2014 at 4:19 PM, Sean Busbey bus...@cloudera.com wrote

Re: How does Accumulo compare to HBase

2014-06-24 Thread Jianshi Huang
and none in the peer-reviewed literature. Old data suggests that HBase performance is ~1% of Accumulo performance. In short, one can often replace a 20+ node database with a single node Accumulo database. On Tue, Jun 24, 2014 at 01:55:54AM +0800, Jianshi Huang wrote: Er... basically I need

Re: How does Accumulo compare to HBase

2014-06-24 Thread Jianshi Huang
? Cheers On Tue, Jun 24, 2014 at 7:29 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: Hi David, I did, it's a wonderful piece of work and for reviewing facts in a network it's a great tool. (And Lumify looks really nice) However, my queries are mostly time-bound (from time A to time B

Re: How does Accumulo compare to HBase

2014-06-24 Thread Jianshi Huang
+Update: Possibly 100s Billion of columns. On Wed, Jun 25, 2014 at 12:03 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: Hi Ted, CF: maybe dozens Columns: billions (rowkey = nodeId, CF = event type, CQ = Index+eventId) Make sense? Jianshi On Tue, Jun 24, 2014 at 10:33 PM, Ted Yu

Re: How does Accumulo compare to HBase

2014-06-24 Thread Jianshi Huang
Ted: +1.5B columns - 5 CF - 300M CQ Jianshi On Wed, Jun 25, 2014 at 1:50 AM, Ted Yu yuzhih...@gmail.com wrote: Thanks for the update. In your experiment so far, how many columns were involved ? Cheers On Tue, Jun 24, 2014 at 10:44 AM, Jianshi Huang jianshi.hu...@gmail.com wrote

How does Accumulo compare to HBase

2014-06-23 Thread Jianshi Huang
Er... basically I need to explain to my manager why we are choosing Accumulo instead of HBase. So what are the pros and cons of Accumulo vs. HBase? (btw HBase 0.98 also got cell-level security, modeled after Accumulo) -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http

Re: MutationsRejectedException and TApplicationException

2014-06-18 Thread Jianshi Huang
, you should be able to see a nice HTML view of any errors 2) Check the debug log, e.g. $ACCUMULO_HOME/logs/tserver_$host.debug.log. If you're running tservers on more than one node, be sure that you check the log files on all nodes. - Josh On 6/17/14, 9:33 PM, Jianshi Huang wrote: Hi, I

Set AccumuloFileOutputFormat to save data to HDFS files instead of writing to Accumulo directly

2014-06-18 Thread Jianshi Huang
and partitioned, makes sense? -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/

Re: MutationsRejectedException and TApplicationException

2014-06-18 Thread Jianshi Huang
) at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleWrite(AbstractNonblockingServer.java:220) at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.select(TNonblockingServer.java:201) Jianshi On Wed, Jun 18, 2014 at 2:54 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: I see

Re: Set AccumuloFileOutputFormat to save data to HDFS files instead of writing to Accumulo directly

2014-06-18 Thread Jianshi Huang
#listSplits) to determine how to partition the output among files. Note that you don't write a Text, Mutation pair, but a Key, Value pair to the files. On Wed, Jun 18, 2014 at 4:21 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: Hi all, I saw this line in accumulo-1.6.0/examples/simple/src
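The reply above describes using the table's split points to decide which output file each Key, Value pair goes to. A minimal sketch of that idea follows, assuming sorted split points and that each split is the inclusive end row of its tablet; `partition_for` and the sample splits are hypothetical names for illustration, not Accumulo's API.

```python
from bisect import bisect_left

def partition_for(row: bytes, splits: list) -> int:
    """Return the index of the output partition for a row.

    `splits` must be sorted. Partition i holds rows in
    (splits[i-1], splits[i]], and the last partition holds
    everything after the final split point.
    """
    # bisect_left finds the first split >= row, i.e. the tablet
    # whose (inclusive) end row covers this row.
    return bisect_left(splits, row)

if __name__ == "__main__":
    # Hypothetical split points; a real job would read them from the table.
    splits = [b"g", b"n", b"t"]
    for row in (b"apple", b"g", b"hat", b"zebra"):
        print(row, "->", partition_for(row, splits))
```

With N split points this yields N + 1 partitions, so a job following this scheme would use one reducer per partition and have each reducer write its pairs in sorted order.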

Re: MutationsRejectedException and TApplicationException

2014-06-18 Thread Jianshi Huang
of resources does accumulo have? On Wed, Jun 18, 2014 at 7:09 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: Here's the error message I got from the tserver_xxx.log 2014-06-18 01:06:06,816 [tserver.TabletServer] INFO : Adding 1 logs

Re: MutationsRejectedException and TApplicationException

2014-06-18 Thread Jianshi Huang
wrote: On Wed, Jun 18, 2014 at 12:51 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Oh, this memory size: tserver.memory.maps.max 1G - 20G (looks like this is an overkill, is it?) Probably. If you have a spare 20G, though... :-) tserver.cache.data.size 128M? - 1024M

Re: MutationsRejectedException and TApplicationException

2014-06-18 Thread Jianshi Huang
tserver debug log to see how much of the JVM memory you are actually using. -Eric On Wed, Jun 18, 2014 at 1:04 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: I see. thank you Josh and Eric. BTW, here's my current JVM memory settings: -Xmx32g -Xms4g -XX:NewSize=2G -XX:MaxNewSize=2G

Re: MutationsRejectedException and TApplicationException

2014-06-18 Thread Jianshi Huang
Just want to correct that -Xmx32g won't enable compressed pointers. Set it to -Xmx31g Jianshi On Thu, Jun 19, 2014 at 1:20 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: I see. The native map was enabled already. I think I understand better now how Accumulo uses my memory. So I increased
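The arithmetic behind that correction can be sketched as follows, assuming HotSpot's defaults of 32-bit compressed references and 8-byte object alignment. The function names are hypothetical, and the exact cutoff on a real JVM may sit slightly below the figure computed here.

```python
# Assumed HotSpot defaults: 32-bit compressed references, 8-byte object alignment.
REFERENCE_BITS = 32
OBJECT_ALIGNMENT_BYTES = 8
GIB = 1024 ** 3

def max_compressed_heap_bytes(reference_bits: int = REFERENCE_BITS,
                              alignment: int = OBJECT_ALIGNMENT_BYTES) -> int:
    """Largest address range a scaled compressed reference can cover."""
    return (2 ** reference_bits) * alignment

def can_use_compressed_pointers(heap_gib: int) -> bool:
    """True if a heap of this size stays below the addressable limit."""
    return heap_gib * GIB < max_compressed_heap_bytes()

if __name__ == "__main__":
    print("limit in GiB:", max_compressed_heap_bytes() // GIB)
    for heap_gib in (31, 32):
        print(f"-Xmx{heap_gib}g compressed pointers possible:",
              can_use_compressed_pointers(heap_gib))
```

Under these assumptions a 32 GiB heap reaches the limit exactly, which is why dropping to -Xmx31g keeps the smaller references.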

Re: Slides for Accumulo Summit 2014

2014-06-18 Thread Jianshi Huang
On Wed, Jun 18, 2014 at 10:48 AM, Jianshi Huang jianshi.hu...@gmail.com wrote: I'm wondering when the slides from the summit will be published to public. I just started using Accumulo and wasn't able to attend the conf. :) Cheers, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang

Re: Fetch rows in reversed order and how to model time range for quick fetching

2014-06-17 Thread Jianshi Huang
/org/apache/accumulo/core/client/lexicoder/ReverseLexicoder.html [2] http://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/client/lexicoder/DateLexicoder.html On 6/16/14, 10:02 PM, Jianshi Huang wrote: Hi all, I'm thinking about storing payments in the following format: rowId
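The lexicoders linked in that reply point at a common trick: encode the time so that the newest entry sorts first under ordinary ascending byte order. The sketch below illustrates the idea only, assuming non-negative millisecond timestamps and fixed-width 8-byte keys. It is not Accumulo's implementation, and the helper names are hypothetical.

```python
import struct

def encode_timestamp(millis: int) -> bytes:
    """Fixed-width big-endian encoding, so byte order matches numeric order."""
    return struct.pack(">Q", millis)

def encode_reversed(millis: int) -> bytes:
    """Invert every byte so larger timestamps sort earlier."""
    return bytes(0xFF - b for b in encode_timestamp(millis))

def decode_reversed(encoded: bytes) -> int:
    """Undo the byte inversion and recover the original timestamp."""
    return struct.unpack(">Q", bytes(0xFF - b for b in encoded))[0]

if __name__ == "__main__":
    # Hypothetical payment times in milliseconds.
    payments = [1000, 3000, 2000]
    keys = sorted(encode_reversed(t) for t in payments)
    # An ascending scan over these keys returns the latest payment first.
    print([decode_reversed(k) for k in keys])
```

Because the table is still scanned in ascending key order, fetching the most recent payment becomes a read of the first entry in the range rather than a reverse scan.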

Re: Need help. Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

2014-06-16 Thread Jianshi Huang
PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Hi, I'm trying to use Accumulo with Spark writing to AccumuloOutputFormat. I got the following errors in my spark app log: 14/06/16 02:01:44 INFO cluster.YarnClientClusterScheduler: YarnClientClusterScheduler.postStartHook done Exception

Fetch rows in reversed order and how to model time range for quick fetching

2014-06-16 Thread Jianshi Huang
the last payment, so can I scan the table using a reversed range? Also I'd like to know if point-in-time status data can be modeled in a similar fashion, or should I take advantage of the timestamp column. Cheers, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http