Thanks for the help. It definitely looks like the move to 0.90 would resolve many of these issues.
-chris

On Feb 15, 2011, at 2:33 PM, Jean-Daniel Cryans wrote:

> That would make sense... although I've done testing and the more files
> you have to split, the longer it takes to create the reference files
> so the longer the split. Now that I think of it, with your high
> blocking store files setting, you may be running into an extreme case
> of https://issues.apache.org/jira/browse/HBASE-3308
>
> J-D
>
> On Tue, Feb 15, 2011 at 2:27 PM, Chris Tarnas <[email protected]> wrote:
>> No swapping, about 30% of the total CPU is idle, looking through ganglia I
>> do see a spike in cpu_wio at that time - but only to 2%. My suspicion,
>> though, is that GZ compression is just taking a while.
>>
>> On Feb 15, 2011, at 2:10 PM, Jean-Daniel Cryans wrote:
>>
>>> Yeah if it's the same key space that splits, it could explain the
>>> issue... 65 seconds is a long time! Is there any swapping going on?
>>> CPU or IO starvation?
>>>
>>> In that context I don't see any problem setting the pausing time higher.
>>>
>>> J-D
>>>
>>> On Tue, Feb 15, 2011 at 1:54 PM, Chris Tarnas <[email protected]> wrote:
>>>> Hi JD,
>>>>
>>>> Two splits happened within 90 seconds of each other on one server - one
>>>> took 65 seconds, the next took 43 seconds. With only a 10 second timeout
>>>> (10 tries, 1 second between) I think that was the issue. Are there any
>>>> hidden issues with raising those retry parameters so I can withstand a
>>>> 120 second pause?
>>>>
>>>> thanks,
>>>> -chris
>>>>
>>>> On Feb 15, 2011, at 1:37 PM, Chris Tarnas wrote:
>>>>
>>>>> On Feb 15, 2011, at 11:32 AM, Jean-Daniel Cryans wrote:
>>>>>
>>>>>> On Tue, Feb 15, 2011 at 11:24 AM, Chris Tarnas <[email protected]> wrote:
>>>>>>> We are definitely considering writing a bulk loader, but as this
>>>>>>> fits into an existing processing pipeline that is not Java and does not
>>>>>>> fit into the importtsv tool (we use column names as data as well), we
>>>>>>> have not done it yet. I do foresee a Java bulk loader in our future
>>>>>>> though.
>>>>>>
>>>>>> Well I was referring to THE bulk loader:
>>>>>> http://hbase.apache.org/bulk-loads.html
>>>>>>
>>>>>
>>>>> It has the same problem really for us. Also - does that need 0.92 for
>>>>> multi-column support? I'm pretty sure we will be moving to a bulk loader
>>>>> soon.
>>>>>
>>>>>>> Does the shell expose the createTable method that defines the number of
>>>>>>> columns (or I suppose I'll probably need to brush up on my JRuby...).
>>>>>>> Splits were definitely happening then. Currently I'm using 1GB regions,
>>>>>>> I'll probably go larger (~5) and salt my keys to distribute them better.
>>>>>>
>>>>>> I don't think that method is in the shell, it'd be weird anyway to
>>>>>> write down hundreds of bytes in the shell IMO... Do you see region
>>>>>> hotspots? If so, definitely solve the key distribution as it's going
>>>>>> to kill your performance. Bigger regions won't really help if you're
>>>>>> still always writing to the same few ones.
>>>>>>
>>>>>
>>>>> We use schema files that we redirect into the shell like DDL. My other
>>>>> reason to go to larger regions was that we are going to have lots of
>>>>> older data as well. The top few loads will be hot and used most often but
>>>>> we do need access to the older data as well. I foresee up to about 2-4
>>>>> billion rows a week, so at the rate we are creating these tables that
>>>>> would be quite a few regions per server at 1GB regions.
>>>>>
>>>>>>> The reason I had thought it might be compaction related is I saw that
>>>>>>> we had hit the hbase.hstore.blockingStoreFiles limit as well as having
>>>>>>> the timeout expire.
>>>>>>>
>>>>>>
>>>>>> Well the writes would block on flushing, so unless all the handlers
>>>>>> are filled then you shouldn't see retries exhausted. You could grep
>>>>>> your logs to see how long the splits took btw, but the total locking
>>>>>> time isn't exactly that time... it's less than that. 0.90.1 would
>>>>>> definitely help here.
>>>>>>
>>>>>
>>>>> Most splits look to be about 5-7 seconds. I'll investigate more around
>>>>> the error times and see if any were longer.
>>>>>
>>>>> We'll be upgrading next week.
>>>>>
>>>>> Thanks again!
>>>>> -chris
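
For reference, the retry arithmetic discussed above (10 tries at 1 second each versus splits of 65 and 43 seconds, with a 120 second target) can be sketched roughly as follows. This is only an illustration that assumes a fixed pause between attempts; the real client may apply backoff between retries, so the actual window can differ. The helper names are hypothetical, and mapping the numbers onto the client's retry count and pause settings (presumably hbase.client.retries.number and hbase.client.pause) is an assumption, not something confirmed in the thread.

```python
import math


def retry_window_seconds(retries: int, pause_seconds: float) -> float:
    """Total time a client keeps retrying, assuming a fixed pause per attempt.

    Assumption: no backoff multiplier is applied between attempts.
    """
    return retries * pause_seconds


def retries_needed(target_seconds: float, pause_seconds: float) -> int:
    """Smallest retry count whose fixed-pause window covers target_seconds."""
    return math.ceil(target_seconds / pause_seconds)


# Figures taken from the thread.
current_window = retry_window_seconds(retries=10, pause_seconds=1.0)
observed_splits = [65, 43]  # seconds each split took

for split in observed_splits:
    outcome = "survives" if split <= current_window else "exhausts retries"
    print(f"{split}s split vs {current_window:.0f}s window: {outcome}")

# Two ways to cover a 120 second pause: more tries, or a longer pause.
print("retries at 1s pause:", retries_needed(120, 1.0))
print("retries at 5s pause:", retries_needed(120, 5.0))
```

Raising the pause rather than only the retry count keeps the number of attempts small, which matches the suggestion above that setting the pausing time higher should not cause problems.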
