Re: Writting bottleneck in HBase ?

2016-12-03 Thread Ted Yu
I was in China the past 10 days where I didn't have access to gmail.

bq. repeat this sequence a thousand times

Do you mean proceeding to the next parameter?

bq. use hashing mechanism to transform this long string

How is the hash generated?
The hash prefix should presumably evenly distribute the write load.
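
For illustration, a hash-prefixed ("salted") row key along these lines spreads sequential writes across pre-split regions. This is a sketch with hypothetical names and bucket count, not code from the thread:

```java
import java.nio.charset.StandardCharsets;

public class SaltedKey {
    // Bucket count is hypothetical; it is typically matched to the number
    // of pre-split regions so each bucket maps to roughly one region.
    static final int BUCKETS = 12;

    // Prefix the row key with a deterministic hash bucket so that keys
    // written in sequence land on different region servers.
    static byte[] salted(String rowId) {
        int bucket = (rowId.hashCode() & Integer.MAX_VALUE) % BUCKETS;
        return String.format("%02d-%s", bucket, rowId)
                     .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Reads (Get) must apply the same function to recompute the prefix.
        System.out.println(new String(salted("PARAM-A-0001"), StandardCharsets.UTF_8));
    }
}
```

The trade-off: a hashed prefix evens out the write load, but a range scan over the original key order becomes BUCKETS separate scans that must be merged client-side.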

Thanks

On Thu, Nov 24, 2016 at 8:13 AM, schausson  wrote:

> Hi, thanks for your answer.
>
> About your question related to thread management : yes, I have several
> threads (up to 4) that may call my persistence method.
>
> When I wrote the post, I had not configured anything special about regions
> for my table so it basically used default splitting policy I guess.
> Following your answer, I gave this a try:
> byte[][] splits = new RegionSplitter.HexStringSplit().split(numberOfRegionServers);
> which led to 12 regions at table creation time.
>
> It slightly improved performance: persistence time dropped from 2 min to
> roughly 1 min 40 s.
>
> I tried with 24 regions but nothing changed then...
>
> About how parameter IDs are distributed: to keep it simple, I read 5
> values per parameter (x2000) and call persistence, and repeat this sequence
> a thousand times. So they should distribute across all my region servers,
> right?
> One additional clue: parameter IDs are alphanumeric, evenly distributed
> between A and Z, but I add a prefix to them which is a long string
> (about 25 characters). To save storage space (because the rowId is duplicated
> for each cell), I use a hashing mechanism to transform this long string into
> a Long value (and I have a mapping table next to the main table), so I don't
> really know how these Long values "distribute"...
>
> Not sure I'm clear...
>
>
>
>
>
>
>
>
>
> --
> View this message in context: http://apache-hbase.679495.n3.nabble.com/Writting-bottleneck-in-HBase-tp4084656p4084678.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
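
For reference, the HexStringSplit pre-split quoted in this thread can be approximated without a running cluster. The sketch below is a simplified re-implementation of the idea (evenly spaced boundaries in an 8-character hex keyspace), not the actual RegionSplitter code:

```java
public class HexSplits {
    // Compute n-1 split points dividing the hex keyspace
    // ("00000000" .. "ffffffff") into n roughly equal regions,
    // similar in spirit to RegionSplitter.HexStringSplit.
    static String[] splitPoints(int numRegions) {
        String[] splits = new String[numRegions - 1];
        long range = 0xFFFFFFFFL; // upper bound of the 32-bit hex keyspace
        for (int i = 1; i < numRegions; i++) {
            long boundary = range / numRegions * i;
            splits[i - 1] = String.format("%08x", boundary);
        }
        return splits;
    }

    public static void main(String[] args) {
        for (String s : splitPoints(12)) {
            System.out.println(s);
        }
    }
}
```

With the HBase client on the classpath, the real equivalent is passing `new RegionSplitter.HexStringSplit().split(n)` to table creation, as the snippet in the thread does; the pre-split only helps if the row keys actually cover the hex keyspace evenly.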


Re: Hot Region Server With No Hot Region

2016-12-03 Thread Ted Yu
I took a look at the stack trace.

Region server log would give us more detail on the frequency and duration
of compactions.

Cheers

On Sat, Dec 3, 2016 at 7:39 AM, Jeremy Carroll  wrote:

> I would check compaction, investigate throttling if it's causing high CPU.
>
> On Sat, Dec 3, 2016 at 6:20 AM Saad Mufti  wrote:
>
> > No.
> >
> > 
> > Saad
> >
> >
> > On Fri, Dec 2, 2016 at 3:27 PM, Ted Yu  wrote:
> >
> > > Somehow I couldn't access the pastebin (I am in China now).
> > > Did the region server showing the hotspot host the meta table?
> > > Thanks
> > >
> > > On Friday, December 2, 2016 11:53 AM, Saad Mufti <
> > saad.mu...@gmail.com>
> > > wrote:
> > >
> > >
> > >  We're in AWS with D2.4xLarge instances. Each instance has 12
> independent
> > > spindles/disks from what I can tell.
> > >
> > > We have charted get_rate and mutate_rate by host and
> > >
> > > a) mutate_rate shows no real outliers
> > > b) read_rate shows the overall rate on the "hotspot" region server is a
> > bit
> > > higher than every other server, not severely but enough that it is a
> bit
> > > noticeable. But when we chart get_rate on that server by region, no one
> > > region stands out.
> > >
> > > get_rate chart by host:
> > >
> > > https://snag.gy/hmoiDw.jpg
> > >
> > > mutate_rate chart by host:
> > >
> > > https://snag.gy/jitdMN.jpg
> > >
> > > 
> > > Saad
> > >
> > >
> > > 
> > > Saad
> > >
> > >
> > > On Fri, Dec 2, 2016 at 2:34 PM, John Leach 
> > > wrote:
> > >
> > > > Here is what I see...
> > > >
> > > >
> > > > * Short Compaction Running on Heap
> > > > "regionserver/ip-10-99-181-146.aolp-prd.us-east-1.ec2.
> > > > aolcloud.net/10.99.181.146:60020-shortCompactions-1480229281547" -
> > > Thread
> > > > t@242
> > > >java.lang.Thread.State: RUNNABLE
> > > >at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.
> > > > compressSingleKeyValue(FastDiffDeltaEncoder.java:270)
> > > >at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.
> > > > internalEncode(FastDiffDeltaEncoder.java:245)
> > > >at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder.
> > > > encode(BufferedDataBlockEncoder.java:987)
> > > >at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.
> > > > encode(FastDiffDeltaEncoder.java:58)
> > > >at org.apache.hadoop.hbase.io
> > .hfile.HFileDataBlockEncoderImpl.encode(
> > > > HFileDataBlockEncoderImpl.java:97)
> > > >at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.write(
> > > > HFileBlock.java:866)
> > > >at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(
> > > > HFileWriterV2.java:270)
> > > >at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(
> > > > HFileWriterV3.java:87)
> > > >at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.
> > > > append(StoreFile.java:949)
> > > >at org.apache.hadoop.hbase.regionserver.compactions.
> > > > Compactor.performCompaction(Compactor.java:282)
> > > >at org.apache.hadoop.hbase.regionserver.compactions.
> > > > DefaultCompactor.compact(DefaultCompactor.java:105)
> > > >at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$
> > > > DefaultCompactionContext.compact(DefaultStoreEngine.java:124)
> > > >at org.apache.hadoop.hbase.regionserver.HStore.compact(
> > > > HStore.java:1233)
> > > >at org.apache.hadoop.hbase.regionserver.HRegion.compact(
> > > > HRegion.java:1770)
> > > >at org.apache.hadoop.hbase.regionserver.CompactSplitThread$
> > > > CompactionRunner.run(CompactSplitThread.java:520)
> > > >at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > ThreadPoolExecutor.java:1142)
> > > >at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > ThreadPoolExecutor.java:617)
> > > >at java.lang.Thread.run(Thread.java:745)
> > > >
> > > >
> > > > * WAL Syncs waiting…  ALL 5
> > > > "sync.0" - Thread t@202
> > > >java.lang.Thread.State: TIMED_WAITING
> > > >at java.lang.Object.wait(Native Method)
> > > >- waiting on <67ba892d> (a java.util.LinkedList)
> > > >at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(
> > > > DFSOutputStream.java:2337)
> > > >at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(
> > > > DFSOutputStream.java:2224)
> > > >at org.apache.hadoop.hdfs.DFSOutputStream.hflush(
> > > > DFSOutputStream.java:2116)
> > > >at org.apache.hadoop.fs.FSDataOutputStream.hflush(
> > > > FSDataOutputStream.java:130)
> > > >at org.apache.hadoop.hbase.regionserver.wal.
> ProtobufLogWriter.sync(
> > > > ProtobufLogWriter.java:173)
> > > >at org.apache.hadoop.hbase.regionserver.wal.FSHLog$
> > > > SyncRunner.run(FSHLog.java:1379)
> > > >at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > * Mutations backing up very badly...
> > > >
> > > > "B.defaultRpcServer.handler=103,queue=7,port=60020" - Thread t@155
> > > >java.lang.Thread.State: TIMED_WAITING
> > > >at 
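
A note on the trace above: all five WAL sync threads are blocked in DFSOutputStream.waitForAckedSeqno, and the RPC handlers queue up behind SyncFuture, so the single WAL pipeline looks like the choke point. One possible mitigation (an assumption on my part, not something raised in the thread) is the multiwal provider available since HBase 1.0, sketched here for hbase-site.xml; verify property names against your release:

```xml
<!-- Spread WAL appends over multiple HDFS pipelines per region server.
     Property names per HBase 1.x docs; illustrative values. -->
<property>
  <name>hbase.wal.provider</name>
  <value>multiwal</value>
</property>
<property>
  <name>hbase.wal.regiongrouping.numgroups</name>
  <value>2</value>
</property>
```
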

Re: Hot Region Server With No Hot Region

2016-12-03 Thread Jeremy Carroll
I would check compaction, investigate throttling if it's causing high CPU.
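
A sketch of such compaction throttling in hbase-site.xml (property names as in HBase 1.1+ and the bound values are illustrative; verify both against your release):

```xml
<!-- Cap compaction write throughput so background compactions cannot
     saturate CPU/disk on a busy region server. -->
<property>
  <name>hbase.regionserver.throughput.controller</name>
  <value>org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController</value>
</property>
<property>
  <name>hbase.hstore.compaction.throughput.higher.bound</name>
  <value>20971520</value> <!-- 20 MB/s when compaction pressure is low -->
</property>
<property>
  <name>hbase.hstore.compaction.throughput.lower.bound</name>
  <value>10485760</value> <!-- 10 MB/s floor -->
</property>
```
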

On Sat, Dec 3, 2016 at 6:20 AM Saad Mufti  wrote:

> No.
>
> 
> Saad

Re: [ANNOUNCE] Apache Phoenix 4.9 released

2016-12-03 Thread Mich Talebzadeh
Many thanks for this announcement.

This is a question for which I have been seeking verification.

Does the new 4.9.0 release of Phoenix support transactions and ACID
compliance on HBase? Put naïvely, can one do with a combination of HBase +
Phoenix what an RDBMS does?

FYI, I am not interested in add-ons or beta tools such as Phoenix in
combination with some other product.

Regards,

Mich




Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 1 December 2016 at 21:31, James Taylor  wrote:

> Apache Phoenix enables OLTP and operational analytics for Apache Hadoop
> through SQL support using Apache HBase as its backing store and providing
> integration with other projects in the ecosystem such as Apache Spark,
> Apache Hive, Apache Pig, Apache Flume, and Apache MapReduce.
>
> We're pleased to announce our 4.9.0 release which includes:
> - Atomic UPSERT through new ON DUPLICATE KEY syntax [1]
> - Support for DEFAULT declaration in DDL statements [2]
> - Specify guidepost width per table [3]
> - Over 40 bugs fixed [4]
>
> The release is available in source or binary form here [5].
>
> Thanks,
> The Apache Phoenix Team
>
> [1] https://phoenix.apache.org/atomic_upsert.html
> [2] https://phoenix.apache.org/language/index.html#column_def
> [3] https://phoenix.apache.org/update_statistics.html
> [4]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315120&version=12335845
> [5] https://phoenix.apache.org/download.html
>


Re: Hot Region Server With No Hot Region

2016-12-03 Thread Saad Mufti
No.


Saad


On Fri, Dec 2, 2016 at 3:27 PM, Ted Yu  wrote:

> Somehow I couldn't access the pastebin (I am in China now).
> Did the region server showing the hotspot host the meta table?
> Thanks