[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-30 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1057:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch checked in.

> [Zebra] Zebra does not support concurrent deletions of column groups now.
> -------------------------------------------------------------------------
>
>                 Key: PIG-1057
>                 URL: https://issues.apache.org/jira/browse/PIG-1057
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Chao Wang
>            Assignee: Chao Wang
>             Fix For: 0.6.0
>
>         Attachments: patch_1057
>
>
> Zebra does not currently support concurrent deletions of column groups.
> As a result, the TestDropColumnGroup test case can fail intermittently.
> In this test case, multiple threads are launched together, each deleting
> one particular column group.  The following exception can be thrown
> (with call stack):
> /*/
> ...
> java.io.FileNotFoundException: File /.../pig-trunk/build/contrib/zebra/test/data/DropCGTest/CG02 does not exist.
>   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
>   at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:290)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:716)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:741)
>   at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:465)
>   at org.apache.hadoop.zebra.io.BasicTable$SchemaFile.setCGDeletedFlags(BasicTable.java:1610)
>   at org.apache.hadoop.zebra.io.BasicTable$SchemaFile.readSchemaFile(BasicTable.java:1593)
>   at org.apache.hadoop.zebra.io.BasicTable$SchemaFile.<init>(BasicTable.java:1416)
>   at org.apache.hadoop.zebra.io.BasicTable.dropColumnGroup(BasicTable.java:133)
>   at org.apache.hadoop.zebra.io.TestDropColumnGroup$DropThread.run(TestDropColumnGroup.java:772)
> ...
> /*/
> We plan to fix this so that Zebra supports concurrent deletions of column
> groups. The root cause is that a thread or process reads stale file system
> information (e.g., it sees that /CG0 exists) and then fails when it later
> tries to access /CG0, which another thread or process may have deleted in
> the meantime.  We therefore plan to adopt retry logic: a thread dropping a
> column group may retry up to n times while doing its deletion, where n is
> the total number of column groups.
> Note that we do NOT try to resolve the more general issue of concurrent
> column group deletions and reads. If a process is reading data that another
> process deletes, it can fail, as expected.
> Here we address only concurrent column group deletions: if multiple threads
> or processes delete column groups, they should all succeed.
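
The retry approach described in the issue can be sketched roughly as below.
This is a minimal illustration under stated assumptions, not the code in
patch_1057: RetryDrop, IoAction and runWithRetry are hypothetical names, and
the sketch uses only the Java standard library instead of the Hadoop
FileSystem API. It retries an operation only when the failure says a file has
vanished, and gives up after a bounded number of retries (the issue proposes
n, the total number of column groups).

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.NoSuchFileException;

public class RetryDrop {
    /** Hypothetical functional interface: an operation that may fail with an I/O error. */
    public interface IoAction {
        void run() throws IOException;
    }

    /**
     * Runs the action and retries it when it fails because a file disappeared,
     * which is the symptom of another thread deleting a column group between
     * our directory listing and our access. Other I/O errors are not retried.
     *
     * @param maxRetries upper bound on retries (assumed here to be the number
     *                   of column groups, as the issue proposes)
     * @return the number of attempts that were needed
     */
    public static int runWithRetry(IoAction action, int maxRetries) throws IOException {
        int retries = 0;
        while (true) {
            try {
                action.run();
                return retries + 1;
            } catch (FileNotFoundException | NoSuchFileException e) {
                if (retries >= maxRetries) {
                    throw e; // retry budget exhausted: surface the failure
                }
                retries++; // stale view of the file system: re-read and try again
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulated drop that sees stale state twice before succeeding.
        int[] calls = {0};
        int attempts = runWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new FileNotFoundException("File .../CG02 does not exist.");
            }
        }, 3);
        System.out.println("succeeded after " + attempts + " attempt(s)");
    }
}
```

Bounding the retries by the number of column groups is plausible because each
retry is caused by some other column group having been deleted in the
meantime, and that can happen at most once per column group.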

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-29 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-1057:
---

Status: Patch Available  (was: Open)




[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-29 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-1057:
---

Attachment: (was: patch_1057)




[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-29 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-1057:
---

Attachment: patch_1057

> [Zebra] Zebra does not support concurrent deletions of column groups now.
> -------------------------------------------------------------------------
>
>                 Key: PIG-1057
>                 URL: https://issues.apache.org/jira/browse/PIG-1057
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Chao Wang
>            Assignee: Chao Wang
>             Fix For: 0.6.0
>
>         Attachments: patch_1057, patch_1057



[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-28 Thread Chao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-1057:
---

Attachment: patch_1057
