RE: Efficient Tablet Merging [SEC=UNOFFICIAL]

Dickson, Matt MR Thu, 03 Oct 2013 20:21:57 -0700

UNOFFICIAL

Hi Eric,
Our answers are in blue. Just a note that we do have the write ahead log 
disabled for ingest performance.
We have a public holiday on Monday, so we may be delayed in our response.

Cheers
Matt

________________________________
From: Eric Newton [mailto:[email protected]]
Sent: Friday, 4 October 2013 11:20
To: [email protected]
Subject: Re: Efficient Tablet Merging [SEC=UNOFFICIAL]

Any errors on those servers?  Each server should be checking periodically for 
compactions, some crazy errors might escape error handling, though that is rare 
these days.
In the tserver debug log there is a repeating error of  "Internal error 
processing applyUpdates  
org.apache.accumulo.server.tabletserver.HoldTimeoutException: Commits are held"

Also found in the tserver log:
ERROR: Failed to find midpoint Filesystem closed
WARN: Tablet .... has too many files, batch lookup can not run

Are you experiencing any table level errors?  Unable to read or write files?
No table level errors or read errors

How full is HDFS?
32%

If you scan the !METADATA table, are you seeing any trend in the tablets that 
have problems?
By getting the extent id of the tablets that are large and then finding the 
range of that tablet by using 'getsplits -v' I have scanned the !METADATA table 
and can see a massive number of *.rf files associated with the range.  Is there 
anything in particular I should look at?
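
One way to narrow this down is to count the file entries per tablet in the 
!METADATA scan output. The following is only a minimal sketch: it assumes each 
scan line has the form "row colfam:colqual [visibility] value" and that data 
files appear under the "file" column family. The sample lines are made up for 
illustration (only the 3n;20130914 extent comes from this thread).

```python
from collections import Counter


def files_per_tablet(scan_lines):
    """Count 'file:' entries per tablet row in !METADATA scan output.

    Assumes whitespace-separated lines: row, colfam:colqual, [vis], value.
    """
    counts = Counter()
    for line in scan_lines:
        parts = line.split()
        # parts[0] is the tablet row (tableId;endRow), parts[1] the column.
        if len(parts) >= 2 and parts[1].startswith("file:"):
            counts[parts[0]] += 1
    return counts


# Hypothetical sample output, not taken from a real cluster.
sample = [
    "3n;20130914 file:/t-0001/F0000001.rf []    1048576,2000",
    "3n;20130914 file:/t-0001/F0000002.rf []    2097152,4100",
    "3n;20130914 srv:dir []    /t-0001",
    "3n;20130921 file:/t-0002/F0000003.rf []    524288,900",
]

counts = files_per_tablet(sample)
for extent, n in counts.most_common():
    print(extent, n)
```

Sorting by count puts the tablets with the most files first, which are the 
ones most likely to trigger the "too many files" warning.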

At this point, we're looking for logged anomalies, the earlier the better.  
Anything red or yellow on the monitor pages.
I ran one of the scans that hang and then see the following:

Several "WARN Exception syncing java.lang.reflect.InvocationTargetException"

Several "ERROR  Unexpected error writing to log, retrying attempt 1
    InvocationTargetException
    Caused by LeaseExpiredException: Lease mismatch on /accumulo/wal/... owned 
by DFSClient_NOMAPREDUCE_56390516_13 but is accessed by 
DFSClient_NOMAPREDUCE_1080760417_13"

"ERROR TTransportException: java.net.SocketTimeoutException: ... while waiting 
for channel to be ready for write. ...."

Bunch of "WARN Tablet 234234234 has too many files..."

On Thu, Oct 3, 2013 at 8:43 PM, Dickson, Matt MR 
<[email protected]> wrote:

UNOFFICIAL

We have restarted the tablet servers that contain tablets with high volumes of 
files and did not see any majc's run.

Some more details are:
On 3 of our nodes we have 10-15 times the number of entries that are on the 
other nodes.  When I view the tablets for one of these nodes there are 2 
tablets with almost 10 times the number of entries as the others.

When we query on the date rowids, the queries now hang.  Several scans are 
running on the 3 nodes that have the higher entry counts and they are not 
completing.  Can I cancel these?

In the logs we are getting "tablet ..... has too many files, batch lookup can 
not run"

At this point I'm stuck for ideas, so any suggestions would be great.

________________________________
From: Eric Newton [mailto:[email protected]]
Sent: Thursday, 3 October 2013 23:52

To: [email protected]
Subject: Re: Efficient Tablet Merging [SEC=UNOFFICIAL]

You should have a major compaction running if your tablet has too many files.  
If you don't, something is wrong. It does take some time to re-write 10G of 
data.

If many merges occurred on a single tablet server, you may have these many-file 
tablets on the same server, and there are not enough major compaction threads 
to re-write those files right away.  If that's true, you may wish to restart 
the tablet server in order to get the tablets pushed to other idle servers.

Again, if you don't have major compactions running, you will want to start 
looking for other problems.

-Eric

On Thu, Oct 3, 2013 at 2:29 AM, Dickson, Matt MR 
<[email protected]> wrote:

UNOFFICIAL

Hi Eric,

We have gone with the second, more conservative option. We changed our split 
threshold to 10GB and then ran a merge over a week's worth of tablets, which 
has resulted in one tablet with a massive number of files. We then ran a query 
over that range and it is returning a message saying:

Tablet has too many files (3n;20130914;20130907...) retrying...

We assumed that when the merge was done a major compaction would be started, 
which would notice that the tablet is too large and split it into 10GB 
tablets. We assumed that we would not have to start any compaction manually, 
and that it would instead be scheduled at some point after the merge finished.

We have completed three separate merges of week long ranges and now have 
identified 3 tablet extents with too many files.

Can you please explain what is supposed to happen? After the merge, does the 
compact command need to be run for those ranges, or will compaction happen 
automatically? We have not seen any started.

Cheers
Matt

________________________________
From: Eric Newton [mailto:[email protected]]
Sent: Thursday, 3 October 2013 13:28
To: [email protected]
Subject: Re: Efficient Tablet Merging [SEC=UNOFFICIAL]

I'll use ASCII graphics to demonstrate the size of a tablet.

Small: []
Medium: [ ]
Large: [  ]

Think of it like this... if you are running age-off... you probably have lots 
of little buckets of rows at the beginning and larger buckets at the end:

[][][][][][][][][]...[ ][ ][ ][ ][ ][  ][  ][    ][    ][    ][    ][    ][    ]

What you probably want is something like this:

[               ][       ][       ][       ][       ][       ][       ][       ]

Some big bucket at the start, with old data, and some larger buckets for 
everything afterwards.  But... this would probably work:

[       ][       ][       ][       ][       ][       ][       ][       ][       ]

Just a bunch of larger tablets throughout.

So you need to set your merge size to "[      ]" (4G), and you can always keep 
creating smaller tablets for future rows with manual splits:

[       ][       ][       ][       ][       ][       ][       ][       ][       ][  ][  ][  ][  ][  ]

So increase the split threshold to 4G, and merge on 4G, but continue to make 
manual splits for your current days, as necessary.  Merge them away later.
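
As a rough illustration of that sequence, here is a small sketch that only 
builds the shell commands as strings. The table name "mytable" and the future 
split points are hypothetical; table.split.threshold, merge -s and addsplits 
are standard shell features, but check the exact options against your version 
before running anything.

```python
def resize_plan(table, size="4G", future_splits=()):
    """Return the shell commands for: raise threshold, merge by size, pre-split."""
    plan = [
        # Raise the split threshold first so merged tablets do not re-split
        # at the old, smaller size.
        f"config -t {table} -s table.split.threshold={size}",
        # Merge small tablets up to roughly the same size.
        f"merge -t {table} -s {size}",
    ]
    if future_splits:
        # Keep finer-grained splits for the days still being ingested.
        plan.append(f"addsplits -t {table} " + " ".join(future_splits))
    return plan


# Hypothetical table name and split points, for illustration only.
plan = resize_plan("mytable", "4G", ["20131004-0100", "20131004-0200"])
for command in plan:
    print(command)
```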

-Eric

On Wed, Oct 2, 2013 at 6:35 PM, Dickson, Matt MR 
<[email protected]> wrote:

UNOFFICIAL

Thanks Eric,

If I do the merge with a size of 4G, does the split threshold also need to be 
increased to 4G?

________________________________
From: Eric Newton [mailto:[email protected]]
Sent: Wednesday, 2 October 2013 23:05
To: [email protected]
Subject: Re: Efficient Tablet Merging [SEC=UNOFFICIAL]

The most efficient way is kind of scary.  If this is a production system, I 
would not recommend it.

First, find out the size of your 10x tablets.  Let's say it's 10G.  Set your 
split threshold to 10G.  Then merge all old tablets.... all of them into one 
tablet.  This will dump thousands of files into a single tablet, but it will 
soon split out again into the nice 10G tablets you are looking for.  The system 
will probably be unusable during this operation.

The more conservative way is to specify the merge in single steps (the master 
will only coordinate a single merge on a table at a time anyhow).  You can do 
it by range or by size... I would do it by size, especially if you are aging 
off your old data.

Compacting the data won't have any effect on the speed of the merge.

-Eric

On Tue, Oct 1, 2013 at 11:58 PM, Dickson, Matt MR 
<[email protected]> wrote:

UNOFFICIAL

I have a table for which we create splits of the form yyyymmdd-nnnn, where nnnn 
ranges from 0000 to 0840.  The bulk of our data is loaded for the current date, 
and no data is loaded for days older than 3 days, so from my understanding it 
would be wise to merge splits older than 3 days in order to reduce the overall 
tablet count.  It would still be optimal to maintain some distribution of 
tablets for a day across the cluster, so I'm looking at merging splits in 
groups of 10, e.g. merge -b 20130901-0000 -e 20130901-0009, thereby reducing 
840 splits per day to 84.
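
To make the grouping concrete, here is a minimal sketch that generates one 
merge command per group of ten consecutive split points for a single day. It 
mirrors the command form quoted above and assumes 840 split points numbered 
0000 to 0839; the table name "mytable" is hypothetical, and how merge treats 
the begin and end rows at the boundaries should be verified before relying on 
the exact tablet count.

```python
def merge_commands(day, splits_per_day=840, group=10, table="mytable"):
    """Build one shell 'merge' command per group of consecutive split points."""
    commands = []
    for start in range(0, splits_per_day, group):
        # Last split point in this group (clamped for a short final group).
        end = min(start + group, splits_per_day) - 1
        commands.append(
            f"merge -t {table} -b {day}-{start:04d} -e {day}-{end:04d}"
        )
    return commands


cmds = merge_commands("20130901")
print(len(cmds))   # number of merge commands for the day
print(cmds[0])     # first range, matching the example in the message
```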

Currently we have 120K tablets (size 1G) on a cluster of 56 nodes and our 
ingest has slowed as the data quantity and tablet count have grown.  Initially 
we were achieving 200-300K, now 50-100K.
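
For a sense of scale, the figures above can be turned into a back-of-envelope 
estimate. This is only a sketch: it assumes every tablet is a full 1G (so the 
total is an upper bound) and uses the 5G merge size floated in the question 
below.

```python
import math

tablets = 120_000      # current tablet count, from the message
nodes = 56             # tablet servers in the cluster
tablet_size_gb = 1     # assumed: every tablet is at the 1G size
merge_size_gb = 5      # candidate merge size

# Upper bound on total data if every tablet were full.
total_gb = tablets * tablet_size_gb

per_server_now = tablets / nodes
merged_tablets = math.ceil(total_gb / merge_size_gb)
per_server_merged = merged_tablets / nodes

print(f"tablets per server now:    {per_server_now:.0f}")
print(f"tablets after 5G merge:    {merged_tablets}")
print(f"tablets per server merged: {per_server_merged:.0f}")
```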

My question is, what is the best way to do this merge?  Should we use the merge 
command with the size option set at something like 5G, or maybe use the 
compaction command?

From my tests this process could take some time so I'm keen to understand the 
most efficient approach.

Thanks in advance,
Matt Dickson
