Concurrency Model for Hive
--------------------------

                 Key: HIVE-1293
                 URL: https://issues.apache.org/jira/browse/HIVE-1293
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Query Processor
            Reporter: Namit Jain


Concurrency model for Hive:

Currently, Hive does not provide a good concurrency model. The only guarantee
provided in the case of concurrent readers and writers is that a
reader will not see a mix of partial data from the old version (before the
write) and partial data from the new version (after the write).
This has turned out to be a big problem, especially for background processes
performing maintenance operations.

The following possible solutions come to mind.

1. Locks: Acquire read/write locks. They can be acquired at the beginning of
the query, or the write locks can be delayed until the move
task (when the directory is actually moved). Care needs to be taken to avoid
deadlocks.

2. Versioning: The writer can create a new version if the current version is
being read. Note that this is not equivalent to snapshots:
the old version can only be accessed by the current readers, and will be
deleted when all of them have finished.
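
As a rough illustration of the two options, here is a minimal Java sketch.
It is not Hive code: the class names TableLocks and VersionedTable and all of
their methods are hypothetical, in-memory stand-ins. For option 1 it assumes
deadlocks are avoided by taking locks in sorted table-name order (one common
technique, not necessarily the one this issue would adopt). For option 2 it
assumes a per-version reader count, and for simplicity it always allocates a
new version number, even when nobody is reading the old one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Option 1 (hypothetical): one read/write lock per table name.
class TableLocks {
    private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    // Acquire every lock a query needs, always in sorted table-name order,
    // so two queries cannot end up waiting on each other in a cycle.
    List<Lock> acquire(Set<String> readTables, Set<String> writeTables) {
        TreeSet<String> all = new TreeSet<>(readTables);
        all.addAll(writeTables);
        List<Lock> held = new ArrayList<>();
        for (String table : all) {
            ReentrantReadWriteLock rw =
                locks.computeIfAbsent(table, k -> new ReentrantReadWriteLock());
            // A table that is both read and written only needs the write lock.
            Lock lock = writeTables.contains(table) ? rw.writeLock() : rw.readLock();
            lock.lock();
            held.add(lock);
        }
        return held;
    }

    // Release in reverse order; must be called by the thread that acquired.
    void release(List<Lock> held) {
        for (int i = held.size() - 1; i >= 0; i--) {
            held.get(i).unlock();
        }
    }
}

// Option 2 (hypothetical): versions are kept alive only by active readers.
class VersionedTable {
    // version id -> number of readers currently using that version
    private final Map<Integer, Integer> readers = new HashMap<>();
    private int current = 0;

    VersionedTable() {
        readers.put(current, 0);
    }

    // A reader pins whatever version is current when it starts.
    synchronized int beginRead() {
        readers.merge(current, 1, Integer::sum);
        return current;
    }

    // The last reader of a superseded version triggers its deletion.
    synchronized void endRead(int version) {
        Integer count = readers.get(version);
        if (count == null || count == 0) {
            throw new IllegalStateException("no active reader on version " + version);
        }
        if (count == 1 && version != current) {
            readers.remove(version); // stands in for deleting the old directory
        } else {
            readers.put(version, count - 1);
        }
    }

    // The writer publishes a new version; an unread old version is dropped
    // immediately, a version still being read survives until endRead().
    synchronized int write() {
        if (readers.get(current) == 0) {
            readers.remove(current);
        }
        current++;
        readers.put(current, 0);
        return current;
    }

    synchronized boolean exists(int version) {
        return readers.containsKey(version);
    }
}
```

In this sketch, a reader that calls beginRead() before a write() keeps its
version alive until it calls endRead(), while any reader starting after the
write only ever sees the new version. That matches the point above that old
versions are not snapshots: nothing can open them once they are superseded.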


Comments.
