[ 
https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699375#comment-13699375
 ] 

stack commented on HBASE-6295:
------------------------------

On:

{code}
the defaults in the code
hbase-default.xml in hbase-common (seems to be used when you run the integration 
test with a cluster)
hbase-site.xml in hbase-server/test (seems to be used when you run the 
integration test with a minicluster)
hbase-site.xml in hbase-client
hbase-site.xml in conf
{code}

Removing hbase-default.xml is a radical notion.  hbase-default.xml used to be 
in conf for all to view and adapt into an hbase-site.xml.  HBASE-3090 moved it 
out of conf and into the jar so that new installs picked up new defaults.  This 
made the hbase-default.xml content effectively opaque unless you unpacked the 
jar or went to the refguide to read the doc we generate from it (see 
http://hbase.apache.org/book.html#hbase.site).  My guess is no one looks at the 
refguide.  This would seem to render hbase-default.xml near useless?  Yet we 
have to maintain it.  In the configuration code, we'll favor the hbase-default* 
setting over what we have in code.
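
To make that precedence concrete, here is a minimal sketch.  It is a simplified 
stand-in, not Hadoop's Configuration class: LayeredConf and its methods are 
hypothetical, and the values are illustrative rather than the defaults we 
ship.  It only models the behavior described above: a value loaded from a 
resource such as hbase-default.xml wins over the default passed in code, and a 
resource loaded later (hbase-site.xml) overrides an earlier one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for a layered configuration; NOT the Hadoop API. */
final class LayeredConf {
  private final Map<String, String> props = new HashMap<>();

  /** Loads a resource; resources loaded later override earlier ones. */
  void addResource(Map<String, String> resource) {
    props.putAll(resource);
  }

  /** Returns the loaded value if present, else the default supplied in code. */
  int getInt(String key, int codeDefault) {
    String v = props.get(key);
    return v == null ? codeDefault : Integer.parseInt(v.trim());
  }

  public static void main(String[] args) {
    String key = "hbase.client.retries.number";
    LayeredConf conf = new LayeredConf();
    // Nothing loaded yet: the default in code is used.
    System.out.println(conf.getInt(key, 3));
    // Stand-in for hbase-default.xml: now favored over the default in code.
    conf.addResource(Collections.singletonMap(key, "10"));
    System.out.println(conf.getInt(key, 3));
    // Stand-in for a test hbase-site.xml: overrides the shipped default.
    conf.addResource(Collections.singletonMap(key, "2"));
    System.out.println(conf.getInt(key, 3));
  }
}
```

If hbase-default.xml went away, only the first case would remain, which is why 
the default in code would have to be the single source of truth.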

If we remove it, then we'll only use what is in code.  That means we won't 
have a list of configs with their descriptions in the doc.

We could generate a Constants java file from the hbase-default.xml src; it 
would hold the defines that we'd use as the default whenever we did 
Configuration#getInt.  If you added something to hbase-default.xml, you'd have 
to use the constant.  That would mean a script run against the src that would 
fail if it found something in hbase-default.xml whose default in code was not 
an upper-case constant?
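
A rough sketch of what such a generator might look like, using only the JDK's 
XML parser.  Everything here is an assumption for illustration: the 
DefaultsConstantsGenerator class, the DEFAULT_ naming scheme, and the choice to 
emit an int when the value parses as one and a String otherwise.  It does not 
cover the src check for defaults that bypass the constants.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical generator: hbase-default.xml style XML in, constants class out. */
final class DefaultsConstantsGenerator {

  /** Maps a key such as "hbase.tmp.dir" to DEFAULT_HBASE_TMP_DIR. */
  static String constantName(String key) {
    return "DEFAULT_" + key.toUpperCase(Locale.ROOT).replaceAll("[^A-Z0-9]", "_");
  }

  /** Returns Java source for a class holding one constant per property element. */
  static String generate(String xml, String className) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalArgumentException("cannot parse defaults XML", e);
    }
    NodeList props = doc.getElementsByTagName("property");
    StringBuilder out = new StringBuilder("public final class " + className + " {\n");
    for (int i = 0; i < props.getLength(); i++) {
      Element p = (Element) props.item(i);
      out.append("  public static final ")
          .append(declaration(text(p, "name"), text(p, "value")))
          .append("\n");
    }
    return out.append("}\n").toString();
  }

  /** Emits an int constant when the value parses as an int, else a String. */
  private static String declaration(String name, String value) {
    try {
      return "int " + constantName(name) + " = " + Integer.parseInt(value) + ";";
    } catch (NumberFormatException e) {
      String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
      return "String " + constantName(name) + " = \"" + escaped + "\";";
    }
  }

  /** Trimmed text of the first child element with the given tag, or "". */
  private static String text(Element parent, String tag) {
    NodeList nodes = parent.getElementsByTagName(tag);
    return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
  }
}
```

With a generated class named, say, HBaseDefaults, a call site would read 
conf.getInt("hbase.client.retries.number", 
HBaseDefaults.DEFAULT_HBASE_CLIENT_RETRIES_NUMBER), so the XML and the code 
could not drift apart.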

The hbase-site.xml in conf is always empty.  It would probably be better named 
hbase-site.xml.template.

The other hbase-site.xmls are configs for the local tests.  The notion is that 
tests have shorter timeouts and fewer retries than what we ship as our 
defaults.  Do we want to reexamine this and have the hbase defaults hold true 
for tests too?

Thanks Elliott and Nicolas for figuring this one out.


                
> Possible performance improvement in client batch operations: presplit and 
> send in background
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6295
>                 URL: https://issues.apache.org/jira/browse/HBASE-6295
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, Performance
>    Affects Versions: 0.95.2
>            Reporter: Nicolas Liochon
>            Assignee: Nicolas Liochon
>              Labels: noob
>             Fix For: 0.98.0, 0.95.2
>
>         Attachments: 6295.addendum.patch, 6295.v11.patch, 6295.v12.patch, 
> 6295.v14.patch, 6295.v15.patch, 6295.v1.patch, 6295.v2.patch, 6295.v3.patch, 
> 6295.v4.patch, 6295.v5.patch, 6295.v6.patch, 6295.v8.patch, 6295.v9.patch, 
> hbase-ycsb-workloads Build time trend.png
>
>
> Today the batch algorithm is:
> {noformat}
> for Operation o: List<Op>{
>   add o to todolist
>   if todolist > maxsize or o last in list
>     split todolist per location
>     send split lists to region servers
>     clear todolist
>     wait
> }
> {noformat}
> We could:
> - create the final object immediately instead of an intermediate array
> - split per location immediately
> - instead of sending when the list as a whole is full, send it when there is 
> enough data for a single location
> It would be:
> {noformat}
> for Operation o: List<Op>{
>   get location
>   add o to location.todolist
>   if (location.todolist > maxLocationSize)
>     send location.todolist to region server 
>     clear location.todolist
>     // don't wait, continue the loop
> }
> send remaining
> wait
> {noformat}
> It's not trivial to write once you add error management: the retry list must 
> be shared with the operations added to the todolist. But it's doable.
> It's mainly interesting for 'big' writes.
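
The per-location loop proposed in the issue description above could be 
sketched in Java roughly as follows.  This is not the patch attached to the 
issue: PerLocationBatcher, the locator and sender functions, and the use of 
CompletableFuture are all hypothetical, the threshold is treated as inclusive, 
and error management and retries are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;
import java.util.function.Function;

/** Sketch of per-location batching; every name here is hypothetical. */
final class PerLocationBatcher<Op> {
  private final Function<Op, String> locator; // operation -> location
  private final BiFunction<String, List<Op>, CompletableFuture<Void>> sender; // async send
  private final int maxLocationSize;

  PerLocationBatcher(Function<Op, String> locator,
      BiFunction<String, List<Op>, CompletableFuture<Void>> sender,
      int maxLocationSize) {
    this.locator = locator;
    this.sender = sender;
    this.maxLocationSize = maxLocationSize;
  }

  /** Splits ops per location and sends a location's list as soon as it is full. */
  void submitAll(List<Op> ops) {
    Map<String, List<Op>> todo = new HashMap<>();
    List<CompletableFuture<Void>> inFlight = new ArrayList<>();
    for (Op o : ops) {
      String location = locator.apply(o);                        // get location
      List<Op> list = todo.computeIfAbsent(location, k -> new ArrayList<>());
      list.add(o);                                               // add o to location.todolist
      if (list.size() >= maxLocationSize) {
        inFlight.add(sender.apply(location, list));              // send, don't wait
        todo.remove(location);                                   // next op starts a fresh list
      }
    }
    // Send whatever is left for each location.
    for (Map.Entry<String, List<Op>> e : todo.entrySet()) {
      inFlight.add(sender.apply(e.getKey(), e.getValue()));
    }
    // Wait once, at the end, for all sends to complete.
    CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
  }
}
```

Handing the full list to the sender and starting a fresh one avoids mutating a 
list that is still in flight.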
