[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980535#comment-14980535
]
Daryn Sharp commented on HDFS-9289:
---
I worked with Chang on this issue and can't think of a scenario in
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981222#comment-14981222
]
Chang Li commented on HDFS-9289:
Thanks [~jingzhao], [~zhz] and [~daryn] for review and valuable
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981262#comment-14981262
]
Zhe Zhang commented on HDFS-9289:
-
bq. That's silent data corruption!
[~daryn] I agree it's a silent data
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981313#comment-14981313
]
Zhe Zhang commented on HDFS-9289:
-
Thanks Jing for the explanation. I agree it's reasonable to throw an
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981321#comment-14981321
]
Zhe Zhang commented on HDFS-9289:
-
A small ask for the next rev:
{code}
// BlockInfo#commitBlock
-
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981291#comment-14981291
]
Jing Zhao commented on HDFS-9289:
-
bq. In general, if a client misreports GS, does it indicate a likelihood
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981798#comment-14981798
]
Hadoop QA commented on HDFS-9289:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978617#comment-14978617
]
Chang Li commented on HDFS-9289:
[~zhz], no we don't have this log because we didn't enable the
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978896#comment-14978896
]
Jing Zhao commented on HDFS-9289:
-
Making DataStreamer#block volatile is a good change, the GS validation
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979367#comment-14979367
]
Zhe Zhang commented on HDFS-9289:
-
Thanks Jing for sharing the thoughts.
I think the GS validation in
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979400#comment-14979400
]
Jing Zhao commented on HDFS-9289:
-
bq. What if the updatePipeline RPC call has successfully finished NN
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975837#comment-14975837
]
Walter Su commented on HDFS-9289:
-
The patch hides a potential bigger bug. We should find it out and
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976815#comment-14976815
]
Chang Li commented on HDFS-9289:
[~zhz], I don't have a log showing that the file was completed with an old GS.
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976735#comment-14976735
]
Zhe Zhang commented on HDFS-9289:
-
bq. the client after updatepipeline with the new gen stamp it later
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976925#comment-14976925
]
Zhe Zhang commented on HDFS-9289:
-
The fact that all 3 DNs have old GS doesn't mean the client also has an
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977156#comment-14977156
]
Chang Li commented on HDFS-9289:
[~zhz], yes, the above log is from the same cluster as the first log I
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977260#comment-14977260
]
Zhe Zhang commented on HDFS-9289:
-
bq. I think there probably exists some cache coherence issue
This sounds
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977679#comment-14977679
]
Hadoop QA commented on HDFS-9289:
-
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976471#comment-14976471
]
Chang Li commented on HDFS-9289:
Hi, [~walter.k.su], I don't know in which cluster this strange case will
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975348#comment-14975348
]
Zhe Zhang commented on HDFS-9289:
-
[~lichangleo] I think the below log shows that the client does have new
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975381#comment-14975381
]
Chang Li commented on HDFS-9289:
[~zhz], you are right, the client had the new genstamp, but the
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974490#comment-14974490
]
Chang Li commented on HDFS-9289:
have met another case in our cluster
{code}
2015-10-23 04:38:08,544 [IPC
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974679#comment-14974679
]
Jing Zhao commented on HDFS-9289:
-
Hi [~lichangleo], what is the current conf setting of the
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974497#comment-14974497
]
Chang Li commented on HDFS-9289:
[~zhz], before we figure out the root cause of this strange case, should
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974525#comment-14974525
]
Zhe Zhang commented on HDFS-9289:
-
[~lichangleo] Thanks for sharing the logs! I'll look at the patch and
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974704#comment-14974704
]
Chang Li commented on HDFS-9289:
Hi [~jingzhao], we are currently taking the default one, and the default
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972655#comment-14972655
]
Chang Li commented on HDFS-9289:
Hi [~zhz], here is the log,
{code}
INFO hdfs.StateChange: BLOCK*
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971590#comment-14971590
]
Elliott Clark commented on HDFS-9289:
-
It had all of the data and the same md5sums when I checked. So
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971302#comment-14971302
]
Chang Li commented on HDFS-9289:
[~eclark], block on 10.210.31.38 should be marked as corrupt because it's
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972209#comment-14972209
]
Zhe Zhang commented on HDFS-9289:
-
[~lichangleo] Thanks for reporting the issue.
bq. but the file
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969890#comment-14969890
]
Elliott Clark commented on HDFS-9289:
-
We just had something very similar happen on a prod
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969979#comment-14969979
]
Chang Li commented on HDFS-9289:
Hi [~eclark], I think the case you gave is not the same and the corrupt
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969984#comment-14969984
]
Chang Li commented on HDFS-9289:
will update patch with info of expected and encountered gen stamp and unit
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969946#comment-14969946
]
Hadoop QA commented on HDFS-9289:
-
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969933#comment-14969933
]
Elliott Clark commented on HDFS-9289:
-
Also can we add the expected and encountered genstamps to the
[
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970213#comment-14970213
]
Elliott Clark commented on HDFS-9289:
-
{code}
15/10/22 09:37:36 INFO BlockStateChange: BLOCK