[jira] [Comment Edited] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900897#comment-16900897
 ] 

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:10 AM:
--

Based on an offline discussion, here is what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** e.g. an ETag mismatch is defined as serious, but a missing ETag is defined 
as a warning
 * The version ID check is removed: to get the version ID of an object we would 
need a HEAD request for that object. We walk the tree using S3 directory 
listings, which is far more efficient in terms of the number of requests, 
because we issue only one request per directory. Checking the version ID would 
require a request for every single object, so the check will be removed 
altogether for now.
 * Error messages will stay in the handlers instead of being added to the 
enums: when writing the log message we need access to the pair (the FileStatus 
from both the MS and S3) to log where the error is, and we need to log parts of 
the FileStatus to show e.g. a mismatch.
 * Added AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH violation.
 * Teardown in itests: close rawfs
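
A minimal sketch of the first and third points above, to illustrate the split between the enum and the handler. This is not the code from the patch: the class name `FsckSketch`, the `Severity` level names, the `ComparePair` fields, the `NO_ETAG` and `ETAG_MISMATCH` constant names, and the severity assigned to AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH are all assumptions made for illustration. Only the idea (severity lives in the enum, message text is built in the handler from the pair) comes from the comment.

```java
import java.util.ArrayList;
import java.util.List;

final class FsckSketch {
    /** Three severity levels; the level names are assumptions. */
    enum Severity { INFO, WARNING, SERIOUS }

    /** Violation types, each carrying its own severity. */
    enum Violation {
        ETAG_MISMATCH(Severity.SERIOUS),   // "serious" per the comment
        NO_ETAG(Severity.WARNING),         // "warn" per the comment
        // Named in the comment; the severity chosen here is an assumption.
        AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH(Severity.SERIOUS);

        final Severity severity;

        Violation(Severity severity) {
            this.severity = severity;
        }
    }

    /** Hypothetical stand-in for the (S3 FileStatus, MS FileStatus) pair. */
    static final class ComparePair {
        final String path;
        final String s3Etag;
        final String msEtag;

        ComparePair(String path, String s3Etag, String msEtag) {
            this.path = path;
            this.s3Etag = s3Etag;
            this.msEtag = msEtag;
        }
    }

    /** Compares the two sides of a pair and returns the violations found. */
    static List<Violation> check(ComparePair pair) {
        List<Violation> found = new ArrayList<>();
        if (pair.msEtag == null) {
            found.add(Violation.NO_ETAG);
        } else if (!pair.msEtag.equals(pair.s3Etag)) {
            found.add(Violation.ETAG_MISMATCH);
        }
        return found;
    }

    /** Handler side: the message needs the pair, so it is built here, not in the enum. */
    static String describe(Violation v, ComparePair pair) {
        String prefix = "[" + v.severity + "] " + pair.path + ": ";
        switch (v) {
            case ETAG_MISMATCH:
                return prefix + "etag mismatch (S3=" + pair.s3Etag
                    + ", MS=" + pair.msEtag + ")";
            case NO_ETAG:
                return prefix + "no etag stored in the MS (S3=" + pair.s3Etag + ")";
            default:
                return prefix + v.name();
        }
    }

    public static void main(String[] args) {
        ComparePair pair = new ComparePair("/data/part-0000", "abc123", "def456");
        for (Violation v : check(pair)) {
            System.out.println(describe(v, pair));
        }
    }
}
```

Keeping the message construction in `describe` means the enum stays a plain classification, while the handler can include whichever fields of the pair are relevant to a given violation.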


was (Author: gabor.bota):
Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * version id check is removed because if we want to have version id for an 
object then we need to do a HEAD req for that object. We walk the tree on S3 
directory listings which is far more efficient in terms of the number of 
requests - we do only one request per-directory and only do this for 
directories. For version id we should do a request for every single object, so 
it will be removed altogether for now.
 * Error messages added to the enums instead of the violation handler.
 * Added AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH violation.
 * Teardown in itests: close rawfs

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Gabor Bota
> Assignee: Gabor Bota
> Priority: Major
>
> This part only logs the inconsistencies.
> This issue covers only the case where the walk is done in S3 and all metadata 
> is compared to the MS.
> The reverse direction, walking the MS and comparing it to S3, is not covered 
> here.
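
The one-directional walk described in the issue can be sketched roughly as follows. This is an in-memory illustration only, not the S3Guard implementation: the class `S3ToMsWalkSketch`, the map standing in for S3 directory listings, and the set standing in for the metadata store are all hypothetical. It also counts listings to show that the walk needs one request per directory rather than one per object.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class S3ToMsWalkSketch {
    /** Outcome of the walk: paths seen in S3 but not in the MS, plus listings issued. */
    static final class Result {
        final List<String> missingInMs = new ArrayList<>();
        int listRequests = 0;
    }

    /**
     * Walks the tree breadth-first from root using directory listings only.
     * s3Listings maps a directory path to its children (a fake for S3 listings);
     * msPaths is the set of paths known to the metadata store (a fake for the MS).
     * The root itself is not checked, and entries that exist only in the MS are
     * never visited, because the walk goes from S3 to the MS and not back.
     */
    static Result walk(String root, Map<String, List<String>> s3Listings,
                       Set<String> msPaths) {
        Result result = new Result();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String dir = queue.remove();
            result.listRequests++;  // one listing per directory
            List<String> children =
                s3Listings.getOrDefault(dir, Collections.<String>emptyList());
            for (String child : children) {
                if (!msPaths.contains(child)) {
                    result.missingInMs.add(child);
                }
                if (s3Listings.containsKey(child)) {
                    queue.add(child);  // child is a directory: descend later
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> s3 = new HashMap<>();
        s3.put("/", Arrays.asList("/a", "/f1"));
        s3.put("/a", Arrays.asList("/a/f2", "/a/f3"));
        Set<String> ms = new HashSet<>(
            Arrays.asList("/a", "/f1", "/a/f2", "/only-in-ms"));

        Result r = walk("/", s3, ms);
        System.out.println("missing in MS: " + r.missingInMs);
        System.out.println("list requests: " + r.listRequests);
    }
}
```

In this example the tree holds three objects in two directories, so the walk issues two listings, whereas a per-object check such as a version ID lookup would need one request for each object.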



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900897#comment-16900897
 ] 

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:06 AM:
--

Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * version id check is removed because if we want to have version id for an 
object then we need to do a HEAD req for that object. We walk the tree on S3 
directory listings which is far more efficient in terms of the number of 
requests - we do only one request per-directory and only do this for 
directories. For version id we should do a request for every single object, so 
it will be removed altogether for now.
 * Error messages added to the enums instead of the violation handler.
 * Added AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH violation.
 * Teardown in itests: close rawfs


was (Author: gabor.bota):
Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * version id check is removed because if we want to have version id for an 
object then we need to do a HEAD req for that object. We walk the tree on S3 
directory listings which is far more efficient in terms of the number of 
requests - we do only one request per-directory and only do this for 
directories. For version id we should do a request for every single object, so 
it will be removed altogether for now.
 * Error messages added to the enums instead of the violation handler.
 * Teardown in itests: close rawfs




[jira] [Comment Edited] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900897#comment-16900897
 ] 

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:06 AM:
--

Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * version id check is removed because if we want to have version id for an 
object then we need to do a HEAD req for that object. We walk the tree on S3 
directory listings which is far more efficient in terms of the number of 
requests - we do only one request per-directory and only do this for 
directories. For version id we should do a request for every single object, so 
it will be removed altogether for now.
 * Error messages added to the enums instead of the violation handler.
 * Teardown in itests: close rawfs


was (Author: gabor.bota):
Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * Error messages added to the enums instead of the violation handler.
 * Teardown in itests: close rawfs




[jira] [Comment Edited] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900897#comment-16900897
 ] 

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:03 AM:
--

Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * Error messages added to the enums instead of the violation handler.
 * Teardown in itests: close rawfs


was (Author: gabor.bota):
Based on an offline discussion what's coming up:
* Severity added to the violation type enums (3 levels, defined in the enum)
* * eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
* Error messages added to the enums instead of the violation handler. 
* Teardown in itests: close rawfs
