Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-14 Thread Alexander Batyrshin
> On 14 May 2020, at 20:21, Bharath Vissapragada wrote: > >> Maybe the TS corruption issue is somehow linked with another issue that we got > - https://issues.apache.org/jira/browse/HBASE-22862 > > > We are running into this too. Our current

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-14 Thread Bharath Vissapragada
> Can you share the code that generates HFiles with delete markers? Here you go. You might want to use table.getDescriptor() and build the column families from that descriptor. I just hardcoded everything for my simple table. You

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-14 Thread Alexander Batyrshin
All corrupted cells were inserted via the Phoenix client. We don’t use bulk load. Maybe the TS corruption issue is somehow linked with another issue that we got - https://issues.apache.org/jira/browse/HBASE-22862 > On 13 May 2020, at 15:11, Wellington

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-14 Thread Alexander Batyrshin
Thank you for this idea. It looks promising. Can you share the code that generates HFiles with delete markers? As I see it, the delete markers were inserted correctly. But what would happen after a major compaction? > On 13 May 2020, at 08:32, Bharath Vissapragada wrote: > > Interesting behavior, I just

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-13 Thread Bharath Vissapragada
> It would be nice to confirm how those Cells could get through with Long.MAX_VALUE timestamp Yep. I'd be curious to hear from more folks in the community on how they dealt with debugging one-off data corruption cases and share any tooling/best practices (tracing etc). Usually it is super

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-13 Thread Wellington Chevreuil
Yeah, creating hfiles manually with Long.MAX_VALUE Delete markers for those cells would be my next suggestion. It would be nice to confirm how those Cells could get through with Long.MAX_VALUE timestamp, it would be surprising if it was WAL replay, I would expect it would reuse the timestamps

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-12 Thread Bharath Vissapragada
Interesting behavior, I just tried it out on my local setup (master/HEAD) out of curiosity to check if we can trick HBase into deleting this bad row and the following worked for me. I don't know how you ended up with that row though (bad bulk load? just guessing). To have a table with the

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-12 Thread Alexander Batyrshin
The table is ~10 TB of SNAPPY-compressed data. I don’t have such a big time window on production for re-inserting all the data. I don’t know how we got those cells. I can only assume that this is Phoenix and/or replaying from the WAL after a region server crash. > On 12 May 2020, at 18:25, Wellington Chevreuil > wrote:

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-12 Thread Wellington Chevreuil
How large is this table? Can you afford to re-insert all current data into a new, temp table? If so, you could write a mapreduce job that scans this table and rewrites all its cells to this new, temp table. I have verified that 1.4.10 does have the timestamp replacing logic here:

Re: How to delete row with Long.MAX_VALUE timestamp

2020-05-12 Thread Alexander Batyrshin
Any ideas how to delete these rows? I see only this way:
- backup data from region that contains “damaged” rows
- close region
- remove region files from HDFS
- assign region
- copy needed rows from backup to recreated region
> On 30 Apr 2020, at 21:00, Alexander Batyrshin <0x62...@gmail.com>

Re: How to delete row with Long.MAX_VALUE timestamp

2020-04-30 Thread Alexander Batyrshin
The same effect for CF:
d = org.apache.hadoop.hbase.client.Delete.new("\x0439d58wj434dd".to_s.to_java_bytes)
d.deleteFamily("d".to_s.to_java_bytes, 9223372036854775807.to_java(Java::long))
table.delete(d)
ROW COLUMN+CELL
\x0439d58wj434dd

Re: How to delete row with Long.MAX_VALUE timestamp

2020-04-29 Thread Wellington Chevreuil
Well, it's weird that puts with such TS values were allowed, given the current state of the code. Can you afford to delete the whole CF for those rows? On Wed, 29 Apr 2020 at 14:41, junhyeok park wrote: > I've been through the same thing. I use 2.2.0 > > On Wed, 29 Apr 2020 at 10:32 PM, Alexander

Re: How to delete row with Long.MAX_VALUE timestamp

2020-04-29 Thread junhyeok park
I've been through the same thing. I use 2.2.0 On Wed, 29 Apr 2020 at 10:32 PM, Alexander Batyrshin <0x62...@gmail.com> wrote: > As you can see in the example, I already tried a DELETE operation with timestamp > = Long.MAX_VALUE without any success. > > > On 29 Apr 2020, at 12:41, Wellington Chevreuil < >

Re: How to delete row with Long.MAX_VALUE timestamp

2020-04-29 Thread Alexander Batyrshin
ed to current time. > > > > > > > -- Original message -- > From: "Wellington Chevreuil" <mailto:wellington.chevre...@gmail.com>; > Sent: Wednesday, 29 April 2020, 5:41 PM > To: "Hbase-User" <mailto:user@hbase.apache.org>; >

Re: How to delete row with Long.MAX_VALUE timestamp

2020-04-29 Thread Alexander Batyrshin
As you can see in the example, I already tried a DELETE operation with timestamp = Long.MAX_VALUE, without any success. > On 29 Apr 2020, at 12:41, Wellington Chevreuil > wrote: > > That's expected behaviour [1]. If you are "travelling to the future", you > need to do a delete specifying

Re: How to delete row with Long.MAX_VALUE timestamp

2020-04-29 Thread zheng wang
It seems like Long.MAX_VALUE is a special value: if it is set as the timestamp, it will be changed to the current time. ---- From: "Wellington Chevreuil" https://hbase.apache.org/book.html#version.delete [2]
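The substitution described in this reply can be illustrated with a small stand-alone model. This is only a sketch, not HBase source code: the class name `LatestTimestampSketch`, the method `resolveTimestamp`, and the `serverNow` parameter are hypothetical names chosen for illustration. The one fact taken from HBase is that `HConstants.LATEST_TIMESTAMP` has the value `Long.MAX_VALUE`, so a mutation carrying that value reads as "no timestamp given".

```java
// Simplified model (an assumption for illustration, not HBase code):
// a mutation timestamp equal to the "latest" sentinel is replaced
// by the server's current time before it is written.
final class LatestTimestampSketch {

    // Same value as HConstants.LATEST_TIMESTAMP in HBase.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    /** Returns the timestamp that would actually be stored in this model. */
    static long resolveTimestamp(long requestedTs, long serverNow) {
        return requestedTs == LATEST_TIMESTAMP ? serverNow : requestedTs;
    }

    public static void main(String[] args) {
        long serverNow = System.currentTimeMillis();
        // An explicit, ordinary timestamp is kept unchanged.
        System.out.println(resolveTimestamp(42L, serverNow));
        // The sentinel is replaced, so the delete marker lands at "now",
        // below a cell that was somehow stored with Long.MAX_VALUE.
        System.out.println(resolveTimestamp(Long.MAX_VALUE, serverNow));
    }
}
```

Under this model, asking for a delete at Long.MAX_VALUE is indistinguishable from asking for a delete without a timestamp, which would match the behaviour reported earlier in the thread.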

Re: How to delete row with Long.MAX_VALUE timestamp

2020-04-29 Thread Wellington Chevreuil
That's expected behaviour [1]. If you are "travelling to the future", you need to do a delete specifying Long.MAX_VALUE as the optional timestamp parameter in the delete operation [2]. If you don't specify a timestamp on the delete, it will assume the current time for the delete marker, which
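The masking rule behind this explanation can be sketched as a tiny stand-alone model. It is an illustration under stated assumptions, not HBase code: `DeleteMaskingSketch` and `isMasked` are hypothetical names, and the rule modelled is the documented one for family and column deletes, where a delete marker hides cell versions whose timestamp is less than or equal to the marker's timestamp.

```java
// Simplified model (an assumption for illustration, not HBase code):
// a delete marker hides a cell version only when the marker's
// timestamp is greater than or equal to the cell's timestamp.
final class DeleteMaskingSketch {

    /** Returns true if a marker written at markerTs hides a cell written at cellTs. */
    static boolean isMasked(long cellTs, long markerTs) {
        return cellTs <= markerTs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long futureCell = Long.MAX_VALUE;

        // A marker at the current time hides an ordinary, older cell.
        System.out.println("old cell hidden by marker at now: " + isMasked(now - 1000, now));
        // The same marker cannot hide a cell stamped with Long.MAX_VALUE.
        System.out.println("MAX_VALUE cell hidden by marker at now: " + isMasked(futureCell, now));
        // Only a marker that itself carries Long.MAX_VALUE would hide it.
        System.out.println("MAX_VALUE cell hidden by marker at MAX_VALUE: "
                + isMasked(futureCell, Long.MAX_VALUE));
    }
}
```

This only shows why a marker placed at the current time leaves a "future" cell visible; whether a marker can actually be stored with Long.MAX_VALUE is a separate question that the rest of the thread discusses.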

How to delete row with Long.MAX_VALUE timestamp

2020-04-29 Thread Alexander Batyrshin
Hello all, We have run into a strange situation: a table has rows with a Long.MAX_VALUE timestamp. These rows are impossible to delete, because the DELETE mutation uses the System.currentTimeMillis() timestamp. Is there any way to delete these rows? We use HBase-1.4.10 Example: hbase(main):037:0> scan