How to redeploy a coprocessor?

2015-08-05 Thread Zephyr Guo
Hi all,

I'm using hbase-0.94. If I update my coprocessor, I have to disable the table
and uninstall the coprocessor, then install the new coprocessor with a jar
whose name differs from the old one.

Is there any other solution?
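For reference, the disable/unset/re-add cycle described above looks roughly like this in the HBase shell (table name, jar path, coprocessor class, and attribute key below are placeholders, not taken from the original question):

```
hbase> disable 'my_table'
# remove the old coprocessor attribute (your attribute key may differ)
hbase> alter 'my_table', METHOD => 'table_att_unset', NAME => 'coprocessor$1'
# attach the new, differently-named jar
hbase> alter 'my_table', 'coprocessor' => 'hdfs:///cp/my-cp-v2.jar|com.example.MyObserver|1001|'
hbase> enable 'my_table'
```

The differently-named jar is needed because the region server's classloader caches classes from the old jar path; pointing at a new file name forces a fresh load.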


Re: [jira] [Created] (HDFS-8722) Optimize datanode writes for small writes and flushes

2015-08-05 Thread Jean-Marc Spaggiari
Hi Nick,

If we are doing short-circuit reads, we skip the Hadoop CRC, right? So this
should impact us only when we are not doing short-circuit? Or does the WAL
not bypass it?

JM

2015-08-03 19:04 GMT-04:00 Nick Dimiduk ndimi...@apache.org:

 FYI, this looks like it would impact small WAL writes.

 On Tue, Jul 7, 2015 at 10:44 AM, Kihwal Lee (JIRA) j...@apache.org
 wrote:

  Kihwal Lee created HDFS-8722:
  --------------------------------

           Summary: Optimize datanode writes for small writes and flushes
               Key: HDFS-8722
               URL: https://issues.apache.org/jira/browse/HDFS-8722
           Project: Hadoop HDFS
        Issue Type: Improvement
          Reporter: Kihwal Lee
          Priority: Critical
 
 
  After the data corruption fix by HDFS-4660, the CRC recalculation for the
  partial chunk is executed more frequently if the client repeatedly writes a
  few bytes and calls hflush/hsync. This is because the generic logic forces
  CRC recalculation if the on-disk data is not CRC chunk aligned. Prior to
  HDFS-4660, the datanode blindly accepted whatever CRC the client provided if
  the incoming data was chunk-aligned. This was the source of the corruption.

  We can still optimize for the most common case, where a client repeatedly
  writes a small number of bytes followed by hflush/hsync with no pipeline
  recovery or append, by allowing the previous behavior for this specific
  case. If the incoming data has a duplicate portion that ends at the last
  chunk boundary before the partial chunk on disk, the datanode can use the
  checksum supplied by the client without redoing the checksum on its own.
  This reduces disk reads as well as CPU load for the checksum calculation.

  If the incoming packet data goes back further than the last on-disk chunk
  boundary, the datanode will still do a recalculation, but this occurs
  rarely, during pipeline recoveries. Thus the optimization for this specific
  case should be sufficient to speed up the vast majority of cases.
 
 
 
  --
  This message was sent by Atlassian JIRA
  (v6.3.4#6332)
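As a rough illustration of the decision described in HDFS-8722 above, the chunk-alignment check can be sketched as follows. This is not the actual DataNode code; the method names and the 512-byte bytes-per-checksum value are assumptions for illustration:

```java
// Sketch of the chunk-alignment decision described in HDFS-8722.
public class ChunkAlignmentSketch {
    static final int BYTES_PER_CHECKSUM = 512; // typical HDFS chunk size

    /**
     * Returns true if the datanode must recompute the CRC of the partial
     * chunk itself, false if it may trust the checksum the client supplied.
     *
     * @param onDiskLen   bytes of the replica already on disk
     * @param packetStart offset in the block where the incoming packet begins
     */
    static boolean mustRecomputeCrc(long onDiskLen, long packetStart) {
        // Last chunk boundary at or below the on-disk length.
        long lastChunkBoundary = (onDiskLen / BYTES_PER_CHECKSUM) * BYTES_PER_CHECKSUM;
        // If the packet rewrites data from before that boundary (e.g. during
        // pipeline recovery), the client's CRC cannot simply be reused.
        return packetStart < lastChunkBoundary;
    }

    public static void main(String[] args) {
        // Common hflush case: packet resends only the partial last chunk.
        System.out.println(mustRecomputeCrc(1000, 512)); // false: trust client CRC
        // Recovery case: packet goes back past the last chunk boundary.
        System.out.println(mustRecomputeCrc(1000, 100)); // true: recompute
    }
}
```

In the common small-write/hflush pattern the packet starts exactly at the last chunk boundary, so under this sketch the datanode skips both the disk read and the CRC recomputation.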
 



[jira] [Resolved] (HBASE-11334) Migrate to SLF4J as logging interface

2015-08-05 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-11334.

Resolution: Won't Fix

We also resolved HBASE-2608 as Won't Fix.

 Migrate to SLF4J as logging interface
 -------------------------------------

 Key: HBASE-11334
 URL: https://issues.apache.org/jira/browse/HBASE-11334
 Project: HBase
  Issue Type: Improvement
Reporter: jay vyas

 Migrating to new log implementations is underway in HBASE-10092.
 The next step would be to abstract them so that the Hadoop community can
 standardize on a logging layer that is easy for end users to tune.
 The simplest way to do this is to use the SLF4J APIs as the main interface,
 with binding/implementation details in the docs as necessary.





Re: [jira] [Created] (HDFS-8722) Optimize datanode writes for small writes and flushes

2015-08-05 Thread Colin McCabe
Hi Jean-Marc,

Short-circuit covers reads, but this performance improvement covers writes.

best,
Colin

On Wed, Aug 5, 2015 at 7:17 AM, Jean-Marc Spaggiari
jean-m...@spaggiari.org wrote:
 Hi Nick,

 If we are doing short-circuit reads, we skip the Hadoop CRC, right? So this
 should impact us only when we are not doing short-circuit? Or does the WAL
 not bypass it?

 JM

 2015-08-03 19:04 GMT-04:00 Nick Dimiduk ndimi...@apache.org:

 FYI, this looks like it would impact small WAL writes.



Re: HTrace

2015-08-05 Thread Priyanka Bhalerao-Deshpande
Hi Masatake,

I was not able to get the client traces, but I can work around it by
generating those spans myself and getting a graph. I had one more question
related to htrace.

If both the client process and the server use the same file to log htrace
spans, is there a possibility of corruption while writing to the span log
file? There will be two JVMs writing to the same file, and in addition our
client is multithreaded. I can use synchronization on the client side to
make sure multiple threads write the client spans in exclusion; however,
there can be a problem if a server span is being written at the same time.
Let me know what you recommend. Thank you.

Best Regards,
Priyanka

On Mon, Jul 27, 2015 at 10:51 AM, Masatake Iwasaki [via Apache HBase] 
ml-node+s679495n4073518...@n3.nabble.com wrote:

   Thank you for getting back to me. Yes I see the Client_htrace.out
 getting
   created on the client node. However it is empty. We are using
 htrace-2.04
   version. I believe that writes the span asynchronously. Also the
 client node
   is running tomcat for serving requests. Would this be a problem?

 Hmm... Do you have multiple processes on the client?
 SpanReceiverHost must be initialized in each process.
 If you call SpanReceiverHost#getInstance in one process and
 call Trace#startSpan in another process,
 the client span is not written to file.

 I think running Tomcat is not related to the issue.


   Is there any need to call closeReceivers in the client side code ? I
 tried
   it but that did not seem to work.

 SpanReceiverHost#closeReceivers should be called just before process exit,
 but spans will be written to the file immediately even without it.










[jira] [Created] (HBASE-14187) Add Thrift 1 RPC to batch gets in a single call

2015-08-05 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-14187:
---

 Summary: Add Thrift 1 RPC to batch gets in a single call
 Key: HBASE-14187
 URL: https://issues.apache.org/jira/browse/HBASE-14187
 Project: HBase
  Issue Type: Improvement
  Components: Thrift
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 2.0.0, 1.3.0


Add a method to pull a set of columns from a set of non-contiguous rows in a
single RPC call.

e.g.
{code}
/**
 * Parallel get. For a given table and column, return the value
 * for each of the given rows.
 *
 * @param tableName table to get from
 * @param column column to get
 * @param rows a list of rows to get
 * @return list of TRowResult, one per row
 */
list<TRowResult> parallelGet(1:Text tableName,
                             2:Text column,
                             3:list<Text> rows)
  throws (1:IOError io)
{code}





Re: [jira] [Created] (HDFS-8722) Optimize datanode writes for small writes and flushes

2015-08-05 Thread Jean-Marc Spaggiari
I see. Makes sense then.

Thanks,

JM

2015-08-05 12:52 GMT-04:00 Colin McCabe cmcc...@alumni.cmu.edu:

 Hi Jean-Marc,

 Short-circuit covers reads, but this performance improvement covers writes.

 best,
 Colin

 On Wed, Aug 5, 2015 at 7:17 AM, Jean-Marc Spaggiari
 jean-m...@spaggiari.org wrote:
  Hi Nick,
 
  If we are doing short-circuit reads, we skip the Hadoop CRC, right? So this
  should impact us only when we are not doing short-circuit? Or does the WAL
  not bypass it?
 
  JM
 
  2015-08-03 19:04 GMT-04:00 Nick Dimiduk ndimi...@apache.org:
 
  FYI, this looks like it would impact small WAL writes.
 



[ANNOUNCE] Apache Phoenix 4.5 released

2015-08-05 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability
of the 4.5 release with support for HBase 0.98/1.0/1.1.

Together with the 4.4 release, highlights include:

Spark Integration (4.4) [1]
User Defined Functions (4.4) [2]
Query Server with thin driver (4.4) [3]
Pherf tool for performance and functional testing at scale (4.4) [4]
Asynchronous index population through MR based index builder (4.5) [5]
Collection of client-side metrics aggregated per statement (4.5) [6]
Improvements to modeling through VIEWs (4.5) [7][8]

More details of the release may be found here [9] and the release may be
downloaded here [10].

Regards,
The Apache Phoenix Team

[1] http://phoenix.apache.org/phoenix_spark.html
[2] http://phoenix.apache.org/udf.html
[3] http://phoenix.apache.org/server.html
[4] http://phoenix.apache.org/pherf.html
[5]
http://phoenix.apache.org/secondary_indexing.html#Asynchronous_Index_Population
[6] https://issues.apache.org/jira/browse/PHOENIX-1819
[7] https://issues.apache.org/jira/browse/PHOENIX-1504
[8] https://issues.apache.org/jira/browse/PHOENIX-978
[9] https://blogs.apache.org/phoenix/entry/announcing_phoenix_4_5_released
[10] http://phoenix.apache.org/download.html


[jira] [Created] (HBASE-14188) Read path optimizations after HBASE-11425 profiling

2015-08-05 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-14188:
--

 Summary: Read path optimizations after HBASE-11425 profiling
 Key: HBASE-14188
 URL: https://issues.apache.org/jira/browse/HBASE-14188
 Project: HBase
  Issue Type: Sub-task
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan


This subtask deals with some improvements that can be done in the read path
(scans) after the changes for HBASE-11425 went in:
- Avoid CellUtil.setSequenceId in the hot path.
- Use BBUtils in MultiByteBuff.
- Use the ByteBuff.skip() API in HFileReader rather than MultiByteBuff.position().






Re: HTrace

2015-08-05 Thread Masatake Iwasaki

Hi Priyanka,

 If both client process and server use the same file to log htraces, then is
 there a possibility of corruption while writing to the span log file?

Are the client and server running on the same host?
You should use a different output file for each process.
It is not safe to write to the same file from multiple processes,
though it is not a problem to write spans from multiple threads in a
single process.
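One simple way to follow this advice and keep the two JVMs out of the same span file is to derive a per-process file name from the JVM's PID. This is a sketch, not HTrace API code; the base path is an assumption for illustration:

```java
import java.lang.management.ManagementFactory;

// Sketch: build a per-process span log path so the client and the server
// (two JVMs) never write to the same file.
public class SpanFileNaming {
    static String perProcessSpanFile(String basePath) {
        // RuntimeMXBean#getName returns "pid@hostname"; take the PID part.
        String jvmName = ManagementFactory.getRuntimeMXBean().getName();
        String pid = jvmName.split("@")[0];
        return basePath + "." + pid;
    }

    public static void main(String[] args) {
        // e.g. /var/log/htrace/spans.out.12345
        System.out.println(perProcessSpanFile("/var/log/htrace/spans.out"));
    }
}
```

Each process then configures its span receiver with its own derived path, so no cross-process locking is needed.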


Masatake


On 8/6/15 01:56, Priyanka Bhalerao-Deshpande wrote:
 Hi Masatake,

 I was not able to get the client traces but I can find a workaround to
 generate those spans myself and get a graph. I had one more question
 related to htrace.

 If both client process and server use the same file to log htraces, 
then is
 there a possibility of corruption while writing to the span log file? 
Since

 there will be 2 JVMs writing to the same file. In addition, our client is
 multithreaded too. I can use synchronize on the client side and make sure
 multiple threads write the client spans in exclusion, however there 
can be

 a problem if a server span is being written at the same time. Let me know
 what you recommend. Thank you.

 Best Regards,
 Priyanka
