RE: Running Accumulo straight from Memory

2012-09-13 Thread Moore, Matthew J.
You guys are confirming what we predicted.  This was a nice-to-have
from our customer, and we wanted to see if anyone else had tried this.
Thanks.

 

Matt

From: user-return-1336-MATTHEW.J.MOORE=saic@accumulo.apache.org
[mailto:user-return-1336-MATTHEW.J.MOORE=saic@accumulo.apache.org]
On Behalf Of Adam Fuchs
Sent: Wednesday, September 12, 2012 5:28 PM
To: user@accumulo.apache.org
Subject: Re: Running Accumulo straight from Memory

 

Yes, the effect of locality groups would be about the same in an
in-memory system. The only exception would be if you're not using
locality groups and are fetching a particular column: the automatic
seeking behavior of the column filtering iterator would be more
efficient with in-memory RFiles.

Adam

On Sep 12, 2012 5:20 PM, David Medinets david.medin...@gmail.com
wrote:

Why would locality groups be useful in an in-memory system?

On Wed, Sep 12, 2012 at 4:53 PM, Adam Fuchs afu...@apache.org wrote:
 Even if you are just using memory, minor and major compactions are important
 to get compression, handle deletes, get sequential access (cache line
 efficiency), use iterators, and introduce locality groups.



Re: Running Accumulo straight from Memory

2012-09-13 Thread Keith Turner
On Wed, Sep 12, 2012 at 5:20 PM, David Medinets
david.medin...@gmail.com wrote:
 Why would locality groups be useful in an in-memory system?

Memory is fast, yet we still organize data in memory to make it really
fast (e.g. hash maps, sorted maps, bloom filters, etc.).  Locality
groups are no different.  If using that data organization will make
what you are attempting to do faster, then you would probably use it.
Assume you have two locality groups and one contains 1% of your data
by volume and the other 99%.  Scanning just the locality group with
1% of the data will be faster than not having locality groups.  It
cuts down on the amount of data you have to read and process from
memory.
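
The 1% / 99% example above can be sketched as a toy model. This is only an
illustration of the idea, not Accumulo's RFile format or API; the group
name "small", the column families "meta" and "body", and all function
names are assumptions made up for the sketch.

```python
from collections import defaultdict

# Toy model only: this is not Accumulo's storage format or API.
# Hypothetical layout: one locality group named "small" that holds the
# "meta" column family; every other family lands in a "default" group.
LOCALITY_GROUPS = {"small": {"meta"}}


def group_for(family):
    """Return the name of the locality group that stores this column family."""
    for group, families in LOCALITY_GROUPS.items():
        if family in families:
            return group
    return "default"


def build_store(entries):
    """Partition (row, family, value) entries by locality group, each kept sorted."""
    store = defaultdict(list)
    for entry in entries:
        store[group_for(entry[1])].append(entry)
    for group in store:
        store[group].sort()
    return store


def scan_with_groups(store, family):
    """Read only the group that can contain the family; return (hits, entries read)."""
    read = store[group_for(family)]
    return [e for e in read if e[1] == family], len(read)


def scan_without_groups(entries, family):
    """No locality groups: every entry is read and then filtered."""
    read = sorted(entries)
    return [e for e in read if e[1] == family], len(read)


# 1,000 entries: 990 in the large "body" family, 10 in the small "meta" family.
entries = [(f"row{i:04d}", "body", "x" * 100) for i in range(990)]
entries += [(f"row{i:04d}", "meta", "m") for i in range(10)]

store = build_store(entries)
hits_with, read_with = scan_with_groups(store, "meta")
hits_without, read_without = scan_without_groups(entries, "meta")

print(read_with, read_without)
```

Both scans return the same ten entries; the difference is how many
entries had to be touched to find them, which is the point being made
about memory as much as about disk.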


 On Wed, Sep 12, 2012 at 4:53 PM, Adam Fuchs afu...@apache.org wrote:
 Even if you are just using memory, minor and major compactions are important
 to get compression, handle deletes, get sequential access (cache line
 efficiency), use iterators, and introduce locality groups.


RE: Running Accumulo straight from Memory

2012-09-12 Thread Moore, Matthew J.
Adam,

It does look like we are the first to try this.  We are trying to keep
everything in memory, and as a result there are no minor compactions, and
probably major compactions to make tables larger.  We tried this on SSDs
using a file system and we were not getting the processing speeds that
we had wanted.

 

Matt

 

 



Re: Running Accumulo straight from Memory

2012-09-12 Thread dlmarion


Matt, 



  Did you see Eric Newton's response yesterday? Running on a RAM disk has been
done; however, minor and major compactions will still occur.



- Dave 




RE: Running Accumulo straight from Memory

2012-09-12 Thread Adam Fuchs
Even if you are just using memory, minor and major compactions are
important to get compression, handle deletes, get sequential access (cache
line efficiency), use iterators, and introduce locality groups.
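
As a rough illustration of two of those points (handling deletes and
getting one sorted, sequential run), here is a toy merge in Python. It is
a sketch, not Accumulo's compaction code: the DELETE marker, the
(key, timestamp, value) tuple layout, and the compact function are all
assumptions. It also assumes every run is merged at once, which is what
makes it safe to drop the delete markers.

```python
import heapq

# Hypothetical tombstone: stands in for a delete entry in this toy model.
DELETE = object()


def compact(runs):
    """Merge sorted runs into a single sorted run.

    Each run is a list of (key, timestamp, value) tuples sorted by key, with
    at most one entry per key. For each key the newest timestamp wins, and a
    key whose newest entry is a tombstone is dropped from the output.
    """
    # Order by key, then newest first, so the first entry seen for a key wins.
    merged = heapq.merge(*runs, key=lambda e: (e[0], -e[1]))
    out = []
    last_key = object()  # sentinel that equals no real key
    for key, ts, value in merged:
        if key == last_key:
            continue  # an older version of a key that was already handled
        last_key = key
        if value is not DELETE:
            out.append((key, ts, value))
    return out


older = [("a", 1, "apple"), ("b", 1, "banana"), ("c", 1, "cherry")]
newer = [("b", 2, DELETE), ("c", 2, "cranberry"), ("d", 2, "date")]

result = compact([older, newer])
print(result)
```

Before the merge, a reader has to consult both runs and reconcile the
delete of "b" on every lookup; afterwards there is one sorted run with no
stale or deleted entries, whether it lives on disk or in memory.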

Adam


Re: Running Accumulo straight from Memory

2012-09-12 Thread David Medinets
Why would locality groups be useful in an in-memory system?

On Wed, Sep 12, 2012 at 4:53 PM, Adam Fuchs afu...@apache.org wrote:
 Even if you are just using memory, minor and major compactions are important
 to get compression, handle deletes, get sequential access (cache line
 efficiency), use iterators, and introduce locality groups.


Running Accumulo straight from Memory

2012-09-11 Thread Moore, Matthew J.
Has anyone run Accumulo on a single server straight from memory?
Probably using something like a Fusion IO drive.  We are trying to use
it without using an SSD or any spinning discs.

 

Matthew Moore
Systems Engineer
SAIC, ISBU
Columbia, MD
410-312-2542

 



Re: Running Accumulo straight from Memory

2012-09-11 Thread Eric Newton
I have run a small cluster with HDFS writing only to a RAM disk.  Is that
the sort of thing you are interested in?

-Eric



Re: Running Accumulo straight from Memory

2012-09-11 Thread William Slacum
Whoops, slow internet; I didn't notice Eric's response.

On Tue, Sep 11, 2012 at 9:30 AM, William Slacum 
wilhelm.von.cl...@accumulo.net wrote:

 You could mount a RAM disk and point HDFS to it.




Re: Running Accumulo straight from Memory

2012-09-11 Thread Eric Newton
Accumulo needs something that provides the FileSystem interface.  It also
needs to be distributed, replicated, and provide for a write-ahead log.
HDFS on a RAM disk pretty much gets you that.

On Tue, Sep 11, 2012 at 12:55 PM, Moore, Matthew J. 
matthew.j.mo...@saic.com wrote:

 Have you tried it where you’re writing to straight block memory?  Not
 using any file system or SATA controller.

 Matt





Re: Running Accumulo straight from Memory

2012-09-11 Thread Adam Fuchs
Matthew,

I don't know of anyone who has done this, but I believe you could:
1. mount a RAM disk
2. point the HDFS core-site.xml fs.default.name property to file:///
3. point the accumulo-site.xml instance.dfs.dir property to a directory on
the RAM disk
4. disable the WAL for all tables by setting the accumulo-site.xml
table.walog.enabled property to false
5. initialize and start up Accumulo as you regularly would and cross your
fingers

Of course, the "you may lose data" and "this is not an officially supported
configuration" caveats apply. Out of curiosity, what would you be trying to
accomplish with this configuration?
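
For steps 2 through 4, the property settings might look roughly like what
the following Python sketch prints. The three property names and values
come from the steps above; the mount point /mnt/ramdisk/accumulo, the
helper name hadoop_style_config, and the assumption that both files use
the usual Hadoop-style configuration/property/name/value layout are mine,
so check them against your own installation before relying on them.

```python
import xml.etree.ElementTree as ET


def hadoop_style_config(properties):
    """Render a dict as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical directory on the RAM disk mounted in step 1.
RAM_DISK_DIR = "/mnt/ramdisk/accumulo"

# Step 2: use the local file system instead of a distributed one.
core_site = hadoop_style_config({"fs.default.name": "file:///"})

# Steps 3 and 4: keep Accumulo's files on the RAM disk and turn off the WAL.
accumulo_site = hadoop_style_config({
    "instance.dfs.dir": RAM_DISK_DIR,
    "table.walog.enabled": "false",
})

print(core_site)
print(accumulo_site)
```

The output is meant to be merged by hand into the existing core-site.xml
and accumulo-site.xml rather than written over them, since both files
normally carry other settings as well.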

Adam

