[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254002#comment-13254002
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-TRUNK-security #171 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/171/])
HBASE-5604 Addendum. Remove test as per discussion on jira. (Revision 
1326002)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253881#comment-13253881
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-TRUNK #2757 (See 
[https://builds.apache.org/job/HBase-TRUNK/2757/])
HBASE-5604 Addendum. Remove test as per discussion on jira. (Revision 
1326002)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253860#comment-13253860
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-0.94 #114 (See 
[https://builds.apache.org/job/HBase-0.94/114/])
HBASE-5604 Addendum. Remove test as per discussion on jira. (Revision 
1326001)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253859#comment-13253859
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-0.94-security #10 (See 
[https://builds.apache.org/job/HBase-0.94-security/10/])
HBASE-5604 Addendum. Remove test as per discussion on jira. (Revision 
1326001)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253839#comment-13253839
 ] 

Lars Hofhansl commented on HBASE-5604:
--

OK. Done. I'll use that build as RC1.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253824#comment-13253824
 ] 

Lars Hofhansl commented on HBASE-5604:
--

@Stack and Ted: I'll just remove that test. The HFile pretty printer will show 
time in the same TZ as WALPlayer would parse it, which is still useful.
I'll not file an extra jira but post an addendum to this one.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253797#comment-13253797
 ] 

Zhihong Yu commented on HBASE-5604:
---

See the following code from 
http://stackoverflow.com/questions/2375222/java-simpledateformat-for-time-zone-with-a-colon-seperator:
{code}
String dateString = "2010-03-01T00:00:00-08:00";
String pattern = "-MM-dd'T'HH:mm:ss";
SimpleDateFormat sdf = new SimpleDateFormat(pattern);
Date date = sdf.parse(dateString);
{code}

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253778#comment-13253778
 ] 

stack commented on HBASE-5604:
--

Remove the Date stuff.  Just do basic ms.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253765#comment-13253765
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Sigh... That's exactly what I was afraid about when adding data parsing.
What timezone is used to store the WAL edit? Is it GMT? If so, the date should 
be interpreted as GMT.
If the data in the WAL is always the local TZ, I should remove the date 
verification test (which in a sense is pointless anyway, because it just tests 
whether SimpleDateFormat works).

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253620#comment-13253620
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-0.94-security #9 (See 
[https://builds.apache.org/job/HBase-0.94-security/9/])
HBASE-5604 addendum, mark TestWALPlayer as LargeTest (Revision 1325608)
HBASE-5604 M/R tool to replay WAL files (Revision 1325562)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java

larsh : 
Files : 
* /hbase/branches/0.94/src/docbkx/ops_mgt.xml
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HLogInputFormat.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253607#comment-13253607
 ] 

Zhihong Yu commented on HBASE-5604:
---

Looking at test failure reported by Hadoop QA: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1514//testReport/org.apache.hadoop.hbase.mapreduce/TestWALPlayer/testTimeFormat/
{code}
java.lang.AssertionError: expected:<1334092861001> but was:<1334067661001>
{code}
I wonder if timezone could be an issue here - the difference is 7 hours.

If you don't want to involve call such as 
setTimeZone(TimeZone.getTimeZone(“America/Los_Angeles”)), please comment out:
{code}
assertEquals(1334092861001L, conf.getLong(HLogInputFormat.END_TIME_KEY, 0));
{code}

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253147#comment-13253147
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-TRUNK-security #169 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/169/])
HBASE-5604 addendum, mark TestWALPlayer as LargeTest (Revision 1325609)
HBASE-5604 M/R tool to replay WAL files (Revision 132)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java

larsh : 
Files : 
* /hbase/trunk/src/docbkx/ops_mgt.xml
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/HLogInputFormat.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253137#comment-13253137
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-TRUNK #2750 (See 
[https://builds.apache.org/job/HBase-TRUNK/2750/])
HBASE-5604 addendum, mark TestWALPlayer as LargeTest (Revision 1325609)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253112#comment-13253112
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-0.94 #111 (See 
[https://builds.apache.org/job/HBase-0.94/111/])
HBASE-5604 addendum, mark TestWALPlayer as LargeTest (Revision 1325608)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253095#comment-13253095
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Done. Thanks Ted.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253094#comment-13253094
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Yes, should do an addendum. Will mark as large.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253085#comment-13253085
 ] 

Ted Yu commented on HBASE-5604:
---

TestWALPlayer.java doesn't have test category.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253002#comment-13253002
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-0.94 #109 (See 
[https://builds.apache.org/job/HBase-0.94/109/])
HBASE-5604 M/R tool to replay WAL files (Revision 1325562)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/docbkx/ops_mgt.xml
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HLogInputFormat.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252977#comment-13252977
 ] 

Hudson commented on HBASE-5604:
---

Integrated in HBase-TRUNK #2749 (See 
[https://builds.apache.org/job/HBase-TRUNK/2749/])
HBASE-5604 M/R tool to replay WAL files (Revision 132)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/src/docbkx/ops_mgt.xml
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/HLogInputFormat.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestWALPlayer.java


> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252934#comment-13252934
 ] 

Lars Hofhansl commented on HBASE-5604:
--

And I did insert the spaces before commit :)

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252916#comment-13252916
 ] 

Ted Yu commented on HBASE-5604:
---

Patch looks good.
Minor comment:
{code}
+  HLog.Entry temp;
+  long i=-1;
{code}
Insert spaces around equal sign above.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252892#comment-13252892
 ] 

Lars Hofhansl commented on HBASE-5604:
--

@Ted: Are you OK with latest patch? If so, I'll commit today.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252812#comment-13252812
 ] 

Hadoop QA commented on HBASE-5604:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12522473/5604-v11.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 3 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1498//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1498//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1498//console

This message is automatically generated.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252698#comment-13252698
 ] 

Hadoop QA commented on HBASE-5604:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12522454/5604-v10.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 3 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1497//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1497//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1497//console

This message is automatically generated.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v10.txt, 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 
> 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252537#comment-13252537
 ] 

Ted Yu commented on HBASE-5604:
---

If you run the patch through 'arc lint', you would see (just excerpt):
{code}
   Warning  (TXT3) Line Too Long
This line is 106 characters long, but the convention is 100 characters.

 274 System.err.println("  -D" + BULK_OUTPUT_CONF_KEY + 
"=/path/for/output");
 275 System.err.println("  (Exactly one table must be specified 
for this option, and not mapping can be specified!)");
 276 System.err.println("Other options:");
>>>  277 System.err.println("  -D" + HLogInputFormat.START_TIME_KEY 
+ "=ms (only apply edit after this time)");
 278 System.err.println("  -D" + HLogInputFormat.END_TIME_KEY + 
"=ms (only apply edit before this time)");
 279 System.err.println("For performance also consider the 
following options:\n"
 280 + "  -Dmapred.map.tasks.speculative.execution=false\n"

   Warning  (TXT3) Line Too Long
This line is 105 characters long, but the convention is 100 characters.

 275 System.err.println("  (Exactly one table must be specified 
for this option, and not mapping can be specified!)");
 276 System.err.println("Other options:");
 277 System.err.println("  -D" + HLogInputFormat.START_TIME_KEY 
+ "=ms (only apply edit after this time)");
>>>  278 System.err.println("  -D" + HLogInputFormat.END_TIME_KEY + 
"=ms (only apply edit before this time)");
 279 System.err.println("For performance also consider the 
following options:\n"
 280 + "  -Dmapred.map.tasks.speculative.execution=false\n"
 281 + "  
-Dmapred.reduce.tasks.speculative.execution=false");
{code}
Below is a long line:
{code}
+throw new IOException(option + " must be specified either in the form 
2001-02-20T16:35:06.99 or as a number of milliseconds");
{code}
Minor comment: the 'a' in 'as a number' should be omitted.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
> 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252530#comment-13252530
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Since this is only new code I propose this for 0.94.0 as well.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
> 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-10 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251341#comment-13251341
 ] 

stack commented on HBASE-5604:
--

None +1 on commit.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
> 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could an M/R (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira