[ 
https://issues.apache.org/jira/browse/HBASE-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400884#comment-13400884
 ] 

Aditya Kishore commented on HBASE-6158:
---------------------------------------

1. I would like to clarify that this change affects CF names and not user 
table names.

{code}
\-<HBase Root>
 +-\<user_table>
   +-\<region_name>
     +-\merges

and

\-<HBase Root>
 +-\<user_table>
   +-\<region_name>
     +-\splits
{code}

2. Prior to this fix, it was impossible for any table to have a column family 
with the name "merges" or "splits" *ON DISK*, since these folders get deleted 
whenever a region is opened. If a table has a defined schema with "merges" or 
"splits" as a CF name, it will continue to accept puts until they are flushed 
to disk, at which point the flush fails with the following exception:

ERROR: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.DroppedSnapshotException: region: <region_name>
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:999)
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:904)
at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:856)
at org.apache.hadoop.hbase.regionserver.HRegionServer.flushRegion(HRegionServer.java:2192)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://<server>:<port>/<path_to_region>/merges
at org.apache.hadoop.hdfs.DistributedFileSystem(DistributedFileSystem.java:519)
at org.apache.hadoop.hdfs.DistributedFileSystem(DistributedFileSystem.java:504)
at org.apache.hadoop.hbase.regionserver.StoreFile.getUniqueFile(StoreFile.java:580)
at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:493)
at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:448)
at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:81)
at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1519)
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:977)
... 9 more
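
To make the failure mode concrete, here is a minimal sketch of the clash using only the local filesystem and java.nio. It is not HBase code: the class FlushClashSketch and the methods openRegion, flush and flushSucceeds are hypothetical stand-ins, and a flush is assumed to be nothing more than writing one file into the CF folder. It only illustrates that deleting a working folder on region open also removes a CF folder of the same name, so the later write fails, while dot-prefixed working folder names (as proposed in the issue description) leave the CF folder alone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

class FlushClashSketch {

    // Hypothetical stand-in for region open: delete the working folders.
    static void openRegion(Path regionDir, String... workDirs) throws IOException {
        for (String name : workDirs) {
            deleteRecursively(regionDir.resolve(name));
        }
    }

    // Hypothetical stand-in for a flush: write one file into the CF folder.
    static void flush(Path regionDir, String cfName) throws IOException {
        Files.write(regionDir.resolve(cfName).resolve("storefile"), new byte[] {1});
    }

    // Deletes a directory tree, children before parents; no-op if it is absent.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    // Creates a scratch "region" with one CF folder, opens it, then flushes.
    // Returns false if the flush fails because the CF folder is gone.
    static boolean flushSucceeds(String cfName, String... workDirs) {
        try {
            Path regionDir = Files.createTempDirectory("region");
            try {
                Files.createDirectory(regionDir.resolve(cfName));
                openRegion(regionDir, workDirs);
                flush(regionDir, cfName);
                return true;
            } catch (NoSuchFileException e) {
                return false;
            } finally {
                deleteRecursively(regionDir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // CF name equals a working folder name: the CF folder is deleted on open.
        System.out.println(flushSucceeds("merges", "merges", "splits"));   // false
        // Dot-prefixed working folders: the CF folder survives the open.
        System.out.println(flushSucceeds("merges", ".merges", ".splits")); // true
    }
}
```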

3. You are right about the "merges"/MERGEDIR not being created anymore. Looks 
like this is leftover code from the original region merge code, which was 
[modified|http://svn.apache.org/viewvc/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java?r1=638612&r2=639775&pathrev=1342855&diff_format=h]
 as part of HBASE-483. However, the delete code still exists. I think the 
constant, along with the code which deletes the MERGEDIR folder, can be safely 
removed.

4. This means that the only folder we may even need to consider during 
upgrade is "splits". However, it is a transient folder which should exist only 
until the parent region is cleaned up, and it does get cleaned up when the 
corresponding region is opened.
                
> Data loss if the words 'merges' or 'splits' are used as Column Family name
> --------------------------------------------------------------------------
>
>                 Key: HBASE-6158
>                 URL: https://issues.apache.org/jira/browse/HBASE-6158
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 0.94.0
>            Reporter: Aditya Kishore
>            Assignee: Aditya Kishore
>             Fix For: 0.92.2, 0.94.1
>
>         Attachments: HBASE-6158_94.patch, HBASE-6158_trunk.patch
>
>
> If a table is created with either 'merges' or 'splits' as one of the Column 
> Family names, it can never be flushed to the disk even though the table 
> creation (and data population) succeeds.
> The reason for this is that these two are used as temporary directory names 
> inside the region folder for merges and splits respectively, and hence 
> conflict with the directories created for a CF with the same name.
> A simple fix would be to use ".merges" and ".splits" as the working folders 
> (patch attached). This will also be consistent with other work folder names. 
> An alternate fix would be to declare these words (and other similar ones) as 
> reserved words and throw an exception when they are used. However, I find the 
> alternate approach unnecessarily restrictive.
