The ulimit open-files limit was only 1024 for the user executing the query. After increasing it, the queries behave better.
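
In case it helps anyone else hitting this: the limit that matters is the one inherited by the JVM actually running the Phoenix client, so it is worth confirming from inside the process that the raised ulimit took effect. A minimal Java sketch (standard JDK management API; assumes a HotSpot JVM on a Unix-like OS):

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;
    import com.sun.management.UnixOperatingSystemMXBean;

    // Print the file-descriptor usage and limit as seen by this JVM, to
    // confirm the raised ulimit is actually inherited by the process
    // that runs the Phoenix queries.
    public class FdLimitCheck {
        public static void main(String[] args) {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            if (os instanceof UnixOperatingSystemMXBean) {
                UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
                System.out.println("Open FDs: " + unix.getOpenFileDescriptorCount());
                System.out.println("Max FDs:  " + unix.getMaxFileDescriptorCount());
            }
        }
    }
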
How can we tell whether we need to reduce or increase the number of salt buckets? Our team set the count based on read/write performance, using data volume and the queries we expected users to run. However, it now seems the performance has degraded. We can recreate the schemas with fewer or more buckets and reload the data, but I haven't seen a hard-and-fast rule for setting the number of buckets.

Our cluster: 12 data nodes, 4 SSDs per node, 128 GB RAM per node, 24 cores with hyperthreading (running HDP 2.5, with HBase as the primary service), and 800+ regions per region server (which seems high).

Any guidance on this would be greatly appreciated. (A sketch of the recreate-and-reload approach appears at the bottom of this mail, below the quoted thread.)

On Tue, Jun 20, 2017 at 11:54 AM, Josh Elser <josh.el...@gmail.com> wrote:
> I think this is more of an issue of your 78 salt buckets than the width
> of your table. Each chunk, running in parallel, is spilling incremental
> counts to disk.
>
> I'd check your ulimit settings on the node which you run this query from
> and try to increase the number of open files allowed before going into
> this one in more depth :)
>
> On 6/16/17 2:31 PM, Michael Young wrote:
>>
>> We are running a 13-node HBase cluster. One table uses 78 SALT_BUCKETS,
>> which seems to work reasonably well for both read and write. This table
>> has 130 columns with a PK of 30 columns (a fairly wide table).
>>
>> However, after adding several new tables, we are seeing errors about
>> too many open files when running a full table scan:
>>
>> Caused by: org.apache.phoenix.exception.PhoenixIOException: Too many open files
>>     at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:152)
>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:84)
>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:63)
>>     at org.apache.phoenix.iterate.SpoolingResultIterator$SpoolingResultIteratorFactory.newIterator(SpoolingResultIterator.java:79)
>>     at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:112)
>>     at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:103)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>     at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.io.IOException: Too many open files
>>     at java.io.UnixFileSystem.createFileExclusively(Native Method)
>>     at java.io.File.createTempFile(File.java:2024)
>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:176)
>>     at org.apache.phoenix.iterate.SpoolingResultIterator$1.thresholdReached(SpoolingResultIterator.java:116)
>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:224)
>>     at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.write(ThresholdingOutputStream.java:92)
>>     at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>     at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
>>     at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
>>     at org.apache.phoenix.util.TupleUtil.write(TupleUtil.java:149)
>>     at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:127)
>>     ... 10 more
>>
>> When running an explain plan:
>> explain select count(1) from MYBIGTABLE
>>
>> +------------------------------------------------------------------------------------------------------------------+
>> | PLAN                                                                                                              |
>> +------------------------------------------------------------------------------------------------------------------+
>> | CLIENT 8728-CHUNK 674830174 ROWS 2721056772632 BYTES PARALLEL 78-WAY FULL SCAN OVER ATT.PRE_ENG_CONVERSION_OLAP   |
>> | ROW TIMESTAMP FILTER [0, 9223372036854775807)                                                                     |
>> | SERVER FILTER BY FIRST KEY ONLY                                                                                   |
>> | SERVER AGGREGATE INTO SINGLE ROW                                                                                  |
>> +------------------------------------------------------------------------------------------------------------------+
>>
>> It has a lot of chunks. Normally this query would return at least some
>> result after running for a few minutes. With appropriate filters in the
>> WHERE clause, the queries run fine.
>>
>> Any suggestions on how to avoid this error and get better performance
>> from full table scans? We realize we don't need to run full table scans
>> regularly; we're just trying to better understand best practices for
>> Phoenix on HBase.
>>
>> Thank you,
>> Michael
>>
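
As mentioned above, here is a rough sketch of the recreate-and-reload approach for trying a different bucket count. SALT_BUCKETS can only be set at CREATE TABLE time, so changing it means creating a new table and copying the data over. Everything below (table name, columns, bucket count, ZooKeeper quorum) is a placeholder for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Create a copy of an existing table with a different SALT_BUCKETS
    // value and reload it with UPSERT SELECT. All names are placeholders;
    // the source and target column lists must line up for SELECT * to work.
    public class ResaltTable {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:phoenix:zk1,zk2,zk3:2181");
                 Statement stmt = conn.createStatement()) {
                stmt.execute(
                    "CREATE TABLE MYBIGTABLE_S24 (" +
                    "  PK1 VARCHAR NOT NULL, " +
                    "  PK2 DATE NOT NULL, " +
                    "  V1 BIGINT, " +
                    "  CONSTRAINT PK PRIMARY KEY (PK1, PK2)" +
                    ") SALT_BUCKETS = 24");
                stmt.executeUpdate(
                    "UPSERT INTO MYBIGTABLE_S24 SELECT * FROM MYBIGTABLE");
                // Phoenix connections default to autoCommit=false, so
                // commit to flush the mutations.
                conn.commit();
            }
        }
    }

For picking the count itself, the Phoenix docs suggest starting near the number of region servers (the hard maximum is 256) and adjusting from measurement; I haven't found firmer guidance than that.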
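
One more note on the spooling Josh described: each parallel chunk buffers its partial result in memory and spills to a temp file once it crosses the client-side spool threshold (phoenix.query.spoolThresholdBytes, 20 MB by default), which is how a many-thousand-chunk scan can burn through file descriptors. If raising ulimit isn't enough, the threshold can be raised to trade client heap for fewer spool files. A hedged sketch, assuming the property is picked up from connection properties (it can also go in the client-side hbase-site.xml); the quorum and the 64 MB value are arbitrary examples:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    // Raise the per-chunk spool threshold so fewer parallel scan chunks
    // spill to temp files, at the cost of more client heap.
    public class SpoolThresholdExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("phoenix.query.spoolThresholdBytes",
                              String.valueOf(64L * 1024 * 1024));
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:phoenix:zk1,zk2,zk3:2181", props)) {
                // Run the wide scan on this connection.
            }
        }
    }
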