Re: [External] Re: Accumulo with S3

2023-01-20 Thread Jeff Kubina
You might want to look at this repo https://github.com/Accumulo-S3/accumulo-s3-fs/tree/main Jeff On Fri, Jan 20, 2023 at 1:02 PM Samudrala, Ranganath [USA] via user < user@accumulo.apache.org> wrote: > In the accumulo-env.sh, we are setting the location of HADOOP_CONF_DIR as > below and addin

Microsoft Ignite Talk on Accumulo in Azure

2019-12-09 Thread Jeff Kubina
If you have not seen it already, Joel Yoker gave an interesting talk at Ignite about getting Apache Accumulo running in Azure. His talk starts at about 38m 38s into the video on this page: https://myignite.techcommunity.microsoft.com/sessions/83924?source=sessions

Does Accumulo 2.0 support OpenJDK 11.0.3 or higher?

2019-09-16 Thread Jeff Kubina
Does Accumulo 2.0 support OpenJDK 11.0.3 or higher? -- Jeff

Re: Force to redistribute tablets among available tservers

2018-07-05 Thread Jeff Kubina
re quickly. -- Jeff Kubina 410-988-4436 On Thu, Jul 5, 2018 at 5:45 AM, Maxim Kolchin wrote: > Hi Jeff, > > > an individual table by taking it offline and then online > > Thank you! I'll try that. But it means that it's not possible to do while > the table is being u

Re: Force to redistribute tablets among available tservers

2018-07-04 Thread Jeff Kubina
servers when it comes back online. -- Jeff Kubina 410-988-4436 On Wed, Jul 4, 2018 at 7:23 AM, Maxim Kolchin wrote: > Hi all, > > Imagine a cluster with one tserver which hosts N tablets. At some point, I > decide to add another tserver to the cluster. Is it possible to force Accumulo >

Does a hard shutdown of accumulo clear all fates in zookeeper

2017-04-05 Thread Jeff Kubina
I had three fates in the accumulo fate queue that had no tasks assigned to them and could not be deleted or failed. All create/delete/clone etc commands were timing out. I did a hard shutdown of all services. I had planned to do "accumulo org.apache.accumulo.server.fate.Admin kill " but that comman

Re: New Accumulo Blog Post

2016-12-20 Thread Jeff Kubina
Chris, Any status on the patch to Accumulo to allow customizing the HDFS volume on which the WALs are stored? -- Jeff Kubina 410-988-4436 On Wed, Nov 2, 2016 at 10:34 PM, Christopher wrote: > I'm aware of at least one person who has patched Accumulo to allow > customizing the

Re: New Accumulo Blog Post

2016-11-02 Thread Jeff Kubina
Thanks for the blog post, very interesting read. Some questions ... 1. Are the operations "Writes mutation to tablet servers’ WAL/Sync or flush tablet servers’ WAL" and "Adds mutations to sorted in memory map of each tablet." performed by threads in parallel? 2. Could the latency of hsync-ing the

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-21 Thread Jeff Kubina
Mike, Yes, thanks for the help. We had to delete the recovered files generated from the WAL a few times but that worked. Then we restarted the two tablets with the TProtocolException exceptions to fix those errors. We saved off the log files for you. Jeff -- Jeff Kubina 410-988-4436 On Fri

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Jeff Kubina
On Tue, Oct 18, 2016 at 6:32 AM, Michael Wall wrote: > Take a look at the master logs for where the WAL was sorted to the > /accumulo/recovery/... > directory. Then look to see if those WALs are still around and contain > content. > Checked one of them, yes it is around with content. Where is

java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-17 Thread Jeff Kubina
We had a lot of datanodes lock up nearly simultaneously in our Accumulo instance. Many more of the tservers also went offline. After about two hours we were able to get all the datanodes and tservers back online with no HDFS blocks lost. However, we have two tservers throwing about 70 exceptions cau

Re: Lost tablet server lock..SESSION_EXPIRED

2016-10-14 Thread Jeff Kubina
txt> documentation with accumulo is helpful in finding latency issues too. -- Jeff Kubina 410-988-4436 On Thu, Oct 13, 2016 at 10:49 AM, Noe Detore wrote: > Yes, seeing a lot of DEBUG:Upsess. Also seeing > [server.GarbageCollectionLogger] > DEBUG: gc ParNew=64.69(+1.24) secs Conc

Re: How can I remove or fix a hanging droptable in Accumulo?

2016-10-12 Thread Jeff Kubina
Lots of bulk ingest assignments. It's been hung for days and even returned after a reboot of the master and tservers. -- Jeff Kubina 410-988-4436 On Wed, Oct 12, 2016 at 6:28 PM, Michael Wall wrote: > Jeff, > > What is the master doing while the droptable is hung? > > Mik

How can I remove or fix a hanging droptable in Accumulo?

2016-10-12 Thread Jeff Kubina
How can I remove or fix a hanging droptable in Accumulo?

Re: Lost tablet server lock..SESSION_EXPIRED

2016-10-07 Thread Jeff Kubina
Noe, Do you have a lot (1000s) of "[tserver.TableServer] DEBUG: UpSess ..." messages in your tserver logs prior to the FATAL or "ERROR: Lost tablet server lock" error message? Jeff -- Jeff Kubina 410-988-4436 On Fri, Oct 7, 2016 at 10:34 AM, Noe Detore wrote: > An

Re: setting tserver configs from the accumulo shell

2016-10-04 Thread Jeff Kubina
sed on > demand. Some of those on demand properties are probably cached into > internal state for indefinite periods of time. It's hard to say which are > which without investigating each property individually (or through > empirical testing). > > On Tue, Oct 4, 2016 at 9:04 P

Re: setting tserver configs from the accumulo shell

2016-10-04 Thread Jeff Kubina
That would be very helpful, but a note in the documentation would be fine initially. Is there an easy way to determine this from the source code? -- Jeff Kubina 410-988-4436 On Tue, Oct 4, 2016 at 8:59 PM, Christopher wrote: > Some do, some don't. One thing we could add to the sh

setting tserver configs from the accumulo shell

2016-10-04 Thread Jeff Kubina
Does changing the values of tserver configs in the accumulo shell, like "config -s tserver.server.threads.minimum=256", require a restart of all the tservers to become effective?

how do I list user permissions per table

2016-09-23 Thread Jeff Kubina
From the Accumulo shell, how do I list all the users who have access to a specific table?

Re: Tuned performance profiles for Accumulo

2016-08-30 Thread Jeff Kubina
On Tue, Aug 30, 2016 at 3:51 PM, Christopher wrote: > Has anybody used tuned ("tune D") to manage their system performance > profiles on an Accumulo cluster? > Yes, we use it a lot. > I've recently been looking into tuned, and found it a very convenient tool > for switching between performance

Re: default for tserver.total.mutation.queue.max increased from 50M to 1M in 1.7

2016-07-07 Thread Jeff Kubina
Interesting, is it only flushed when the buffer is full or is there a time limit on it also? For example, if 25M of mutations are written and no more, when is the buffer flushed? -- Jeff Kubina 410-988-4436 On Thu, Jul 7, 2016 at 11:50 AM, Christopher wrote: > The change was introduced

default for tserver.total.mutation.queue.max increased from 50M to 1M in 1.7

2016-07-07 Thread Jeff Kubina
buffer is flushed to calculate how this affects performance? -- Jeff Kubina

Re: Read and writing rfiles

2015-12-23 Thread Jeff Kubina
> Are the hadoop nodes handling your map-reduce job also running tservers? > Yes. Do the Accumulo log files show the exception? If so, can you post it? Yes, but nothing helpful to track down the cause, it was a very sparse error message. I will try to post the full error messages.

Read and writing rfiles

2015-12-23 Thread Jeff Kubina
I have a mapreduce job that reads rfiles as Accumulo key/value pairs using FileSKVIterator within a RecordReader, partitions/shuffles them based on the byte string of the key, and writes them out as new rfiles using the AccumuloFileOutputFormat. The objective is to create larger rfiles for bulk i

Re: How does Accumulo process a r-files for bulk ingesting?

2015-10-16 Thread Jeff Kubina
So if a table has splits 0, 1, 2, 3, 4 and I create an rfile with only splits in the range 1-2 and 3-4, after bulk ingesting will the tablet with range 2-3 also be assigned the rfile? -- Jeff Kubina 410-988-4436 On Wed, Oct 7, 2015 at 12:05 PM, Jeff Kubina wrote: > Moved this thread to

What is the optimal number of tablets for a large table?

2015-10-09 Thread Jeff Kubina
I read the following from the Accumulo manual on tablet merging : > Over time, a table can get very large, so large that it has hundreds of > thousands of split points. Once there are enough tablets to spread a table > a

Re: How does Accumulo process a r-files for bulk ingesting?

2015-10-07 Thread Jeff Kubina
Moved this thread to dev@. -- Jeff Kubina 410-988-4436 On Wed, Oct 7, 2015 at 11:18 AM, Josh Elser wrote: > I'd say email is good for discussion on design (dev@ instead of user@ > would probably be more appropriate though). JIRA works better when it gets > down to implemen

Re: How does Accumulo process a r-files for bulk ingesting?

2015-10-07 Thread Jeff Kubina
Okay, shall I flesh out some details of the performance testing code in this thread or a JIRA? -- Jeff Kubina 410-988-4436 On Wed, Oct 7, 2015 at 10:57 AM, Josh Elser wrote: > Jeff Kubina wrote: > >> So has testing been done to determine how much a lack of data locality >>

Re: How does Accumulo process a r-files for bulk ingesting?

2015-10-07 Thread Jeff Kubina
So has testing been done to determine how much a lack of data locality of bulk ingest files affects query performance?

Re: How does Accumulo process a r-files for bulk ingesting?

2015-10-07 Thread Jeff Kubina
So if the HDFS has a replication factor of m and an r-file has a range that intersects n tablets, then data-locality will never be achieved for max(0,n-m) of the r-files, that is, they will never be on the same node as their tablet server until compaction, correct? -- Jeff Kubina 410-988-4436
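The arithmetic in the question above can be worked through with a small sketch. It only evaluates the expression max(0, n-m) as stated in the message; it is not a confirmed property of Accumulo or HDFS, and the function and parameter names are hypothetical.

```python
def without_locality(n_tablets: int, replication: int) -> int:
    """Evaluate max(0, n - m) from the question above.

    n_tablets   -- n, the number of tablets the r-file's range intersects
    replication -- m, the HDFS replication factor
    Assumption: at most m nodes hold a replica of the r-file, so at most
    m of the n tablets could be hosted next to a copy of it.
    """
    return max(0, n_tablets - replication)


# An r-file spanning 5 tablets with replication factor 3
print(without_locality(5, 3))
# An r-file spanning 2 tablets with replication factor 3
print(without_locality(2, 3))
```

With n = 5 and m = 3 the expression gives 2; with n = 2 and m = 3 it gives 0.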

How does Accumulo process a r-files for bulk ingesting?

2015-10-07 Thread Jeff Kubina
How does Accumulo process an r-file for bulk ingesting when the key range of an r-file is within one tablet's key range and when the key range of an r-file spans two or more tablets? If the r-file is within one tablet's range I thought the file was "just renamed" and added to the tablet's list of

Re: Question about configuring the linux niceness of tablet servers?

2015-08-18 Thread Jeff Kubina
> > I haven't heard of anybody setting the niceness of the Accumulo processes > > before. Are you experiencing a lot of CPU contention on your nodes, such > > that you need to prioritize processes? > Yes, we have had mapreduce jobs "lock out" the tserver so long that the master removes them from i

Re: Question about configuring the linux niceness of tablet servers?

2015-08-17 Thread Jeff Kubina
More like a mapreduce task process. -- Jeff Kubina 410-988-4436 On Mon, Aug 17, 2015 at 5:33 PM, William Slacum wrote: > By "Hadoop" do you mean a Yarn NodeManager process? > > On Mon, Aug 17, 2015 at 4:21 PM, Jeff Kubina > wrote: > >> On each of the processi

Question about configuring the linux niceness of tablet servers?

2015-08-17 Thread Jeff Kubina
On each of the processing nodes in our cluster we have running 1) HDFS (datanode), 2) Accumulo (tablet server), and 3) Hadoop. Since Accumulo depends on the HDFS, and Hadoop depends on the HDFS and sometimes on Accumulo, we are considering setting the niceness of HDFS to 0 (the current value), Accu

Re: questions regarding accumulo tracing

2015-08-13 Thread Jeff Kubina
On Thu, Aug 13, 2015 at 2:52 PM, Josh Elser wrote: > 1. Regarding the information above about accumulo tracing, if more than >> one server is listed in $ACCUMULO_HOME/conf/tracers how do the clients >> select the trace server to send their trace data to? >> > > Tracers register themselves in ZooK

questions regarding accumulo tracing

2015-08-13 Thread Jeff Kubina
> > To collect traces, Accumulo needs at least one server listed in > $ACCUMULO_HOME/conf/tracers. The server collects traces from clients and > writes them to the trace table. 1. Regarding the information above about accumulo tracing, if more than one server is listed in $ACCUMULO_HOME/conf/trac

Question about the TableInfo class

2014-12-08 Thread Jeff Kubina
In the TableInfo class is tablets the total number of tablets a

Re: (U) I do I tell Accumulo where the jars for my custom formatters and balancers are for a specific table?

2014-08-25 Thread Jeff Kubina
rs to accumulo-site.xml? -- Jeff Kubina 410-988-4436 On Mon, Aug 25, 2014 at 3:20 PM, wrote: > Jeff, > > Which version of Accumulo are you using? Most of the classpath settings > are in the accumulo-site.xml file. > > > ------ > *From:

Re: (U) I do I tell Accumulo where the jars for my custom formatters and balancers are for a specific table?

2014-08-25 Thread Jeff Kubina
Sorry, that was supposed to read "How do I tell Accumulo where the jars containing my custom formatters and balancers are for a specific table?" -- Jeff Kubina 410-988-4436 On Mon, Aug 25, 2014 at 2:03 PM, Jeff Kubina wrote: > (U) I do I tell Accumulo where the jars for my cust

(U) I do I tell Accumulo where the jars for my custom formatters and balancers are for a specific table?

2014-08-25 Thread Jeff Kubina
(U) I do I tell Accumulo where the jars for my custom formatters and balancers are for a specific table?

What API call can I use to get the number of (online) tablet servers of an Accumulo instance?

2013-09-17 Thread Jeff Kubina
What API call can I use to get the number of (online) tablet servers of an Accumulo instance?

How to efficiently find lexicographically adjacent records?

2013-08-07 Thread Jeff Kubina
I have records in an Accumulo table where the key is a byte string about 50 bytes long. Given a new key k, I want to find the m records that would precede and succeed the record if it were inserted into the table. Any ideas on how I can do this efficiently? The record will eventually be inserted into
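To make the question concrete, here is a minimal sketch of the lookup being asked for. It assumes a sorted in-memory Python list standing in for the table and uses no Accumulo API; the function name neighbors and the sample keys are hypothetical.

```python
from bisect import bisect_left


def neighbors(sorted_keys, k, m):
    """Return up to m keys before and up to m keys after the slot where k sorts.

    sorted_keys must be in ascending lexicographic (byte) order.
    """
    # Index of the first key >= k, i.e. where k would be inserted
    i = bisect_left(sorted_keys, k)
    # If k is already present, start the successors after it
    j = i + 1 if i < len(sorted_keys) and sorted_keys[i] == k else i
    before = sorted_keys[max(0, i - m):i]
    after = sorted_keys[j:j + m]
    return before, after


keys = [b"a", b"c", b"e", b"g"]
print(neighbors(keys, b"d", 2))
```

Python compares bytes objects lexicographically by unsigned byte value, so the list order here follows the ordering described in the question.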

Re: Is there a method in the accumulo api to get the total bytes used and/or total key/value pairs for each tablet?

2013-04-11 Thread Jeff Kubina
So how would I get the info from the master, or could you point me to the monitor code? Thanks. -- Jeff Kubina 410-988-4436 On Thu, Apr 11, 2013 at 11:06 AM, Keith Turner wrote: > On Thu, Apr 11, 2013 at 7:15 AM, Jeff Kubina > wrote: > > Is there a method in the accumulo ap

Is there a method in the accumulo api to get the total bytes used and/or total key/value pairs for each tablet?

2013-04-11 Thread Jeff Kubina
Is there a method in the accumulo api to get the total bytes used and/or total key/value pairs for each tablet? I believe I can get the total bytes used per tablet using HDFS file size calls on the tables directory, but what about the total key/value pairs for each tablet?

question about the values in the Ingest column on the monitor page

2013-04-09 Thread Jeff Kubina
On the Accumulo monitor page (http://accumulo.apache.org/screenshots.html), does the Ingest column contain values that are some rate of ingest of rows? If so, what is the time period that the rate is measured over?

Re: What is the Communication and Time Complexity for Bulk Inserts?

2012-10-24 Thread Jeff Kubina
rting with the BatchWriter is eventually dominated by compactions, > which is a merge sort, or O(n log n). > > -Eric > > On Thu, Oct 18, 2012 at 11:37 AM, Jeff Kubina > wrote: > > BatchWriter, but I would be interested in the answer assuming a > > pre-sorted rfile. > > >

Re: What is the Communication and Time Complexity for Bulk Inserts?

2012-10-18 Thread Jeff Kubina
BatchWriter, but I would be interested in the answer assuming a pre-sorted rfile. On Thu, Oct 18, 2012 at 11:20 AM, Josh Elser wrote: > Are you referring to "bulk inserts" as importing a pre-sorted rfile of > Key/Values or using a BatchWriter? > > On 10/18/12 10:49 AM, Jeff