How to redeploy a coprocessor?
Hi all, I'm using HBase 0.94. If I update my coprocessor, I have to disable the table and uninstall the coprocessor, then install the new coprocessor from a jar with a different name than the old one. Is there any other solution?
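For readers hitting the same issue, the cycle described above can be sketched as a list of hbase shell commands. The table name, jar path, and coprocessor class below are hypothetical, and the attribute syntax is a sketch of the 0.94-era shell conventions; embedding a version number in the jar name is the usual workaround for region servers caching the previously loaded jar.

```python
# Sketch of the redeploy cycle (hbase shell commands as strings).
# All names below are illustrative examples, not a tested recipe.
def redeploy_commands(table, coproc_class, jar_dir, version):
    # Region servers cache the loaded coprocessor jar, so reloading an
    # updated jar under the same path is unreliable; a versioned jar
    # name sidesteps the stale classloader.
    jar = "%s/my-coproc-%d.jar" % (jar_dir, version)
    return [
        "disable '%s'" % table,
        "alter '%s', METHOD => 'table_att_unset', NAME => 'coprocessor$1'" % table,
        "alter '%s', 'coprocessor' => '%s|%s|1001|'" % (table, jar, coproc_class),
        "enable '%s'" % table,
    ]

cmds = redeploy_commands("mytable", "com.example.MyObserver", "hdfs:///coproc", 2)
for c in cmds:
    print(c)
```

The disable/enable pair brackets the attribute change so every region reopens with the new jar.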
Re: [jira] [Created] (HDFS-8722) Optimize datanode writes for small writes and flushes
Hi Nick, If we are doing short-circuit, we skip the Hadoop CRC, right? So this should impact us only when we are not doing short-circuit? Or does the WAL not bypass it? JM

2015-08-03 19:04 GMT-04:00 Nick Dimiduk ndimi...@apache.org: FYI, this looks like it would impact small WAL writes.

On Tue, Jul 7, 2015 at 10:44 AM, Kihwal Lee (JIRA) j...@apache.org wrote:

Kihwal Lee created HDFS-8722:
Summary: Optimize datanode writes for small writes and flushes
Key: HDFS-8722
URL: https://issues.apache.org/jira/browse/HDFS-8722
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Kihwal Lee
Priority: Critical

After the data corruption fix in HDFS-4660, the CRC recalculation for a partial chunk is executed more frequently if the client repeatedly writes a few bytes and calls hflush/hsync. This is because the generic logic forces CRC recalculation whenever the on-disk data is not CRC-chunk-aligned. Prior to HDFS-4660, the datanode blindly accepted whatever CRC the client provided if the incoming data was chunk-aligned; this was the source of the corruption.

We can still optimize for the most common case, where a client repeatedly writes a small number of bytes followed by hflush/hsync with no pipeline recovery or append, by allowing the previous behavior for this specific case. If the incoming data has a duplicate portion and it starts at the last chunk boundary before the partial chunk on disk, the datanode can use the checksum supplied by the client without redoing the checksum on its own. This reduces disk reads as well as the CPU load of the checksum calculation. If the incoming packet data goes back further than the last on-disk chunk boundary, the datanode will still do a recalculation, but this occurs rarely, during pipeline recoveries. Thus the optimization for this specific case should be sufficient to speed up the vast majority of cases.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
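The boundary condition in the JIRA description can be illustrated with a small sketch. The chunk size and function names here are made up for illustration; the real datanode logic is considerably more involved.

```python
CHUNK = 512  # bytes covered by one CRC in this sketch
             # (the real value comes from dfs.bytes-per-checksum)

def can_reuse_client_checksum(on_disk_len, packet_offset):
    """Illustrative decision from the JIRA description: if the incoming
    packet starts exactly at the last chunk boundary at or before the
    on-disk partial chunk, the datanode can trust the client's CRC for
    that chunk instead of re-reading disk data and recomputing."""
    last_boundary = (on_disk_len // CHUNK) * CHUNK
    return packet_offset == last_boundary

# Typical small-write + hflush pattern: on-disk data ends mid-chunk and
# the next packet re-sends from the last chunk boundary -> reuse CRC.
assert can_reuse_client_checksum(on_disk_len=1000, packet_offset=512)

# Pipeline recovery may re-send data from further back than the last
# boundary -> the datanode still recomputes (the rare case).
assert not can_reuse_client_checksum(on_disk_len=1000, packet_offset=0)
```

The point of the optimization is that the common hflush/hsync loop always hits the first branch, so the disk read and CRC recomputation are skipped for the vast majority of small writes.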
[jira] [Resolved] (HBASE-11334) Migrate to SLF4J as logging interface
[ https://issues.apache.org/jira/browse/HBASE-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-11334. Resolution: Won't Fix

We also resolved HBASE-2608 as Won't Fix.

Migrate to SLF4J as logging interface
Key: HBASE-11334
URL: https://issues.apache.org/jira/browse/HBASE-11334
Project: HBase
Issue Type: Improvement
Reporter: jay vyas

Migration to new log implementations is underway in HBASE-10092. The next step would be to abstract them so that the Hadoop community can standardize on a logging layer that is easy for end users to tune. The simplest way to do this is to use the SLF4J APIs as the main interface, with binding/implementation details in the docs as necessary.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [jira] [Created] (HDFS-8722) Optimize datanode writes for small writes and flushes
Hi Jean-Marc, Short-circuit covers reads, but this performance improvement covers writes. best, Colin

On Wed, Aug 5, 2015 at 7:17 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Nick, If we are doing short-circuit, we skip the Hadoop CRC, right? So this should impact us only when we are not doing short-circuit? Or does the WAL not bypass it? JM

2015-08-03 19:04 GMT-04:00 Nick Dimiduk ndimi...@apache.org: FYI, this looks like it would impact small WAL writes.

On Tue, Jul 7, 2015 at 10:44 AM, Kihwal Lee (JIRA) j...@apache.org wrote: Kihwal Lee created HDFS-8722: Summary: Optimize datanode writes for small writes and flushes Key: HDFS-8722 URL: https://issues.apache.org/jira/browse/HDFS-8722 ...
Re: HTrace
Hi Masatake, I was not able to get the client traces, but I found a workaround to generate those spans myself and get a graph. I had one more question related to HTrace. If both the client process and the server use the same file to log traces, is there a possibility of corruption while writing to the span log file, since there will be two JVMs writing to the same file? In addition, our client is multithreaded too. I can use synchronization on the client side and make sure multiple threads write the client spans in exclusion; however, there can be a problem if a server span is being written at the same time. Let me know what you recommend. Thank you. Best Regards, Priyanka

On Mon, Jul 27, 2015 at 10:51 AM, Masatake Iwasaki [via Apache HBase] ml-node+s679495n4073518...@n3.nabble.com wrote:

Thank you for getting back to me. Yes, I see the Client_htrace.out getting created on the client node. However, it is empty. We are using htrace-2.04; I believe that writes the spans asynchronously. Also, the client node is running Tomcat for serving requests. Would this be a problem?

Hmm... Do you have multiple processes on the client? SpanReceiverHost must be initialized in each process. If you call SpanReceiverHost#getInstance in one process and call Trace#startSpan in another process, the client span is not written to the file. I think running Tomcat is not related to the issue.

Is there any need to call closeReceivers in the client-side code? I tried it, but that did not seem to work.

SpanReceiverHost#closeReceivers should be called just before process exit, but spans will be written to the file immediately without it.

-- View this message in context: http://apache-hbase.679495.n3.nabble.com/HTrace-tp4056705p4073515.html Sent from the HBase Developer mailing list archive at Nabble.com.
[jira] [Created] (HBASE-14187) Add Thrift 1 RPC to batch gets in a single call
Sean Busbey created HBASE-14187: --- Summary: Add Thrift 1 RPC to batch gets in a single call Key: HBASE-14187 URL: https://issues.apache.org/jira/browse/HBASE-14187 Project: HBase Issue Type: Improvement Components: Thrift Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 1.3.0

Add a method to pull a set of columns from a set of non-contiguous rows in a single RPC call, e.g.

{code}
/**
 * Parallel get. For a given table and column, return the value for
 * the given rows.
 *
 * @param tableName table to get from
 * @param column column to get
 * @param rows a list of rows to get
 * @return list of TRowResult, one for each row
 */
list<TRowResult> parallelGet(1:Text tableName, 2:Text column, 3:list<Text> rows) throws (1:IOError io)
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
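A toy sketch of what the batch RPC buys over the per-row Thrift 1 `get`: one round trip for N rows instead of N. The in-memory "table" and the call counter are stand-ins, not the HBase Thrift API.

```python
# Stand-in for a remote table and a counter of network round trips.
calls = {"count": 0}
table = {"r1": {"cf:a": "v1"}, "r2": {"cf:a": "v2"}, "r3": {"cf:a": "v3"}}

def get(row, column):
    # One RPC per row: what a Thrift 1 client does today.
    calls["count"] += 1
    return table[row].get(column)

def parallel_get(rows, column):
    # The proposed single-RPC batch get over non-contiguous rows.
    calls["count"] += 1
    return [table[r].get(column) for r in rows]

rows = ["r1", "r2", "r3"]
one_by_one = [get(r, "cf:a") for r in rows]  # 3 round trips
batched = parallel_get(rows, "cf:a")         # 1 round trip
assert one_by_one == batched
```

Same results either way; the batched form just amortizes the per-call latency across all rows.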
Re: [jira] [Created] (HDFS-8722) Optimize datanode writes for small writes and flushes
I see. Makes sense then. Thanks, JM

2015-08-05 12:52 GMT-04:00 Colin McCabe cmcc...@alumni.cmu.edu: Hi Jean-Marc, Short-circuit covers reads, but this performance improvement covers writes. best, Colin

On Wed, Aug 5, 2015 at 7:17 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Nick, If we are doing short-circuit, we skip the Hadoop CRC, right? So this should impact us only when we are not doing short-circuit? Or does the WAL not bypass it? JM ...
[ANNOUNCE] Apache Phoenix 4.5 released
The Apache Phoenix team is pleased to announce the immediate availability of the 4.5 release, with support for HBase 0.98/1.0/1.1. Together with the 4.4 release, highlights include:

- Spark Integration (4.4) [1]
- User Defined Functions (4.4) [2]
- Query Server with thin driver (4.4) [3]
- Pherf tool for performance and functional testing at scale (4.4) [4]
- Asynchronous index population through MR-based index builder (4.5) [5]
- Collection of client-side metrics aggregated per statement (4.5) [6]
- Improvements to modeling through VIEWs (4.5) [7][8]

More details of the release may be found here [9] and the release may be downloaded here [10]. Regards, The Apache Phoenix Team

[1] http://phoenix.apache.org/phoenix_spark.html
[2] http://phoenix.apache.org/udf.html
[3] http://phoenix.apache.org/server.html
[4] http://phoenix.apache.org/pherf.html
[5] http://phoenix.apache.org/secondary_indexing.html#Asynchronous_Index_Population
[6] https://issues.apache.org/jira/browse/PHOENIX-1819
[7] https://issues.apache.org/jira/browse/PHOENIX-1504
[8] https://issues.apache.org/jira/browse/PHOENIX-978
[9] https://blogs.apache.org/phoenix/entry/announcing_phoenix_4_5_released
[10] http://phoenix.apache.org/download.html
[jira] [Created] (HBASE-14188) Read path optimizations after HBASE-11425 profiling
ramkrishna.s.vasudevan created HBASE-14188: -- Summary: Read path optimizations after HBASE-11425 profiling Key: HBASE-14188 URL: https://issues.apache.org/jira/browse/HBASE-14188 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan

This subtask deals with some improvements that can be done in the read path (scans) after the changes for HBASE-11425 went in:
- Avoid CellUtil.setSequenceId in the hot path.
- Use BBUtils in the MultiByteBuff.
- Use the ByteBuff.skip() API in HFileReader rather than MultiByteBuff.position().

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
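The skip()-vs-position() point can be illustrated with a toy multi-segment buffer. The class, its fields, and the search accounting are illustrative only, not the actual MultiByteBuff implementation.

```python
class MultiBuff:
    """Toy model of a buffer backed by several segments, in the spirit
    of a multi-byte buffer. Tracks how often a boundary search runs."""
    def __init__(self, segments):
        self.segments = segments  # list of segment lengths in bytes
        self.item = 0             # index of the current segment
        self.offset = 0           # offset within the current segment
        self.searches = 0         # boundary searches performed

    def position(self, absolute):
        # Absolute positioning must search from the start for the
        # segment containing the target offset.
        self.item, self.offset = 0, 0
        remaining = absolute
        while remaining > self.segments[self.item]:
            self.searches += 1
            remaining -= self.segments[self.item]
            self.item += 1
        self.offset = remaining

    def skip(self, n):
        # A relative skip advances from where the reader already is,
        # usually touching only the current segment.
        self.offset += n
        while self.offset > self.segments[self.item]:
            self.searches += 1
            self.offset -= self.segments[self.item]
            self.item += 1

# Sequential reader visiting offsets 150 then 250 over three 100-byte
# segments, first with absolute positioning, then with relative skips.
buf_a = MultiBuff([100, 100, 100])
buf_a.position(150)
buf_a.position(250)

buf_b = MultiBuff([100, 100, 100])
buf_b.skip(150)
buf_b.skip(100)

# Both readers end at the same place, but the relative skips needed
# fewer boundary searches than the repeated absolute repositionings.
assert (buf_a.item, buf_a.offset) == (buf_b.item, buf_b.offset)
assert buf_b.searches < buf_a.searches
```

That asymmetry is why a sequential reader like HFileReader prefers a relative skip over re-deriving an absolute position on every advance.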
Re: HTrace
Hi Priyanka,

If both client process and server use the same file to log htraces, then is there a possibility of corruption while writing to the span log file?

The client and server are running on the same host? You should use a different output file for each process. It is not safe to write to the same file from multiple processes, though it is not a problem to write spans from multiple threads in a single process.

Masatake

On 8/6/15 01:56, Priyanka Bhalerao-Deshpande wrote: Hi Masatake, I was not able to get the client traces, but I found a workaround to generate those spans myself and get a graph. I had one more question related to HTrace. ...
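Masatake's advice (one span output file per process) can be sketched as follows. The path scheme and helper function are just an example, not an HTrace API; embedding the role and pid guarantees the client and server JVMs never interleave writes in one file.

```python
import os

def span_file_for_process(base_dir, role):
    # One file per process: the role distinguishes client from server,
    # and the pid keeps restarted or concurrent processes apart too.
    return os.path.join(base_dir, "%s_htrace.%d.out" % (role, os.getpid()))

client_file = span_file_for_process("/var/log/htrace", "client")
server_file = span_file_for_process("/var/log/htrace", "server")

# Two processes on the same host get distinct files, so only
# same-process threads ever share a file (which is safe).
assert client_file != server_file
```

In-process thread safety is the receiver's job; the per-process file just removes the cross-JVM case the thread synchronization cannot cover.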