The "Hive/ViewDev" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/ViewDev?action=diff&rev1=10&rev2=11

--------------------------------------------------

  
  '''Update 30-Dec-2009''':  Based on a design review meeting, we're going to 
go with the flat model.  Prasad pointed out that in the future, for 
materialized views, we may need the view definition to be tracked at the 
partition level as well, so that when we change the view definition, we don't 
have to discard existing materialized partitions if the new view result can be 
derived from the old one.  So it may make sense to add the view definition as a 
new attribute of StorageDescriptor (since that is already present at both table 
and partition level).
  
+ '''Update 20-Jan-2010''':  After further discussion with Prasad, we decided 
to put the view definition on the table object instead; for details, see JIRA.
+ 
  == Dependency Tracking ==
  
  It's necessary to track dependencies from a view to objects it references in 
the metastore:
@@ -97, +99 @@

  
  However, if later we want to introduce persistent functions, or track column 
dependencies, this model will be insufficient, and we may need to introduce 
inheritance, with a DependencyParticipant base class from which tables, 
columns, functions etc all derive.  (Again, need to verify that JDO inheritance 
will actually support what we want here.)
  
- '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
the bare-minimum MySQL approach (with no metastore support for dependency 
tracking), then if time allows, add dependency analysis and storage, followed 
by CASCADE support.
+ '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
the bare-minimum MySQL approach (with no metastore support for dependency 
tracking), then if time allows, add dependency analysis and storage, followed 
by CASCADE support.  See HIVE-1073 and HIVE-1074.
  
  == Dependency Invalidation ==
  
@@ -108, +110 @@

  
  Note that besides table modifications, other operations such as CREATE OR 
REPLACE VIEW have similar issues (since views can reference other views).  The 
lenient approach provides a reasonable solution for the related issue of 
external tables whose schemas may be dynamic (not sure if we currently support 
this).
  
- '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
the lenient approach, without any support for marking objects invalid in the 
metastore, then if time allows, follow up with strict support and possibly 
metastore support for tracking object validity.
+ '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
the lenient approach, without any support for marking objects invalid in the 
metastore, then if time allows, follow up with strict support and possibly 
metastore support for tracking object validity.  See HIVE-1077.
  
  == View Modification ==
  
@@ -119, +121 @@

  
  Note that supporting view modification requires detection of cyclic view 
definitions, which should be invalid.  Whether this detection is carried out at 
the time of view modification versus reference is dependent on the strict 
versus lenient approaches to dependency invalidation described above.
  
- '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
an Oracle-style ALTER VIEW v RECOMPILE, which can be used to revalidate a view 
definition, as well as to re-expand the original definition for clauses such as 
select *.  Then if time allows, we'll follow up with CREATE OR REPLACE VIEW 
support.  (The latter is less important since we're going with the lenient 
invalidation model, making DROP and re-CREATE possible without having to deal 
with downstream dependencies.)
+ '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
an Oracle-style ALTER VIEW v RECOMPILE, which can be used to revalidate a view 
definition, as well as to re-expand the original definition for clauses such as 
select *.  Then if time allows, we'll follow up with CREATE OR REPLACE VIEW 
support.  (The latter is less important since we're going with the lenient 
invalidation model, making DROP and re-CREATE possible without having to deal 
with downstream dependencies.)  See HIVE-1077 and HIVE-1078.
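The note above says cyclic view definitions should be invalid, and that detection may run either at modification time or at reference time. A minimal sketch of such a check follows, assuming the view-to-referenced-object edges are already available as an in-memory map. The class name, method names, and map layout are hypothetical illustrations and are not part of Hive or its metastore API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Hive or its metastore API.
class ViewCycleCheck {

    // refs maps each view name to the names of the tables/views it references.
    // Names that never appear as keys (e.g. base tables) are treated as leaves.
    static boolean hasCycle(Map<String, List<String>> refs) {
        Set<String> finished = new HashSet<>(); // nodes fully explored, known cycle-free
        Set<String> onPath = new HashSet<>();   // nodes on the current DFS path
        for (String view : refs.keySet()) {
            if (visit(view, refs, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; reaching a node already on the current path means a cycle.
    private static boolean visit(String node, Map<String, List<String>> refs,
                                 Set<String> finished, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;
        }
        if (finished.contains(node)) {
            return false;
        }
        onPath.add(node);
        for (String ref : refs.getOrDefault(node, List.of())) {
            if (visit(ref, refs, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        finished.add(node);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("v1", List.of("t1", "v2"));
        refs.put("v2", List.of("t2"));
        System.out.println(hasCycle(refs)); // prints false

        refs.put("v2", List.of("v1"));      // v1 -> v2 -> v1
        System.out.println(hasCycle(refs)); // prints true
    }
}
```

Under a strict invalidation model, a check like this would presumably run when a view is created or replaced. Under the lenient model chosen here, it would run when the view is referenced, over the definitions as expanded at that point.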
  
  == Fast Path Execution ==
  
@@ -135, +137 @@

  
  == Underlying Partition Dependencies ==
  
- '''Update 30-Dec-2009''':  Prasad pointed out that even without supporting 
materialized views, it may be necessary to provide users with metadata about 
data dependencies between views and underlying table partitions so that users 
can avoid seeing inconsistent results during the window when not all partitions 
have been refreshed with the latest data.  One option is to attempt to derive 
this information automatically (using an overconservative guess in cases where 
the dependency analysis can't be made smart enough); another is to allow view 
creators to declare the dependency rules in some fashion as part of the view 
definition.  Based on a design review meeting, we will probably go with the 
automatic analysis approach once dependency tracking is implemented.  The 
analysis will be performed on-demand, perhaps as part of describing the view or 
submitting a query job against it.  Until this becomes available, users may be 
able to do their own analysis either via empirical lineage tools or via 
view->table dependency tracking metadata once it is implemented.
+ '''Update 30-Dec-2009''':  Prasad pointed out that even without supporting 
materialized views, it may be necessary to provide users with metadata about 
data dependencies between views and underlying table partitions so that users 
can avoid seeing inconsistent results during the window when not all partitions 
have been refreshed with the latest data.  One option is to attempt to derive 
this information automatically (using an overconservative guess in cases where 
the dependency analysis can't be made smart enough); another is to allow view 
creators to declare the dependency rules in some fashion as part of the view 
definition.  Based on a design review meeting, we will probably go with the 
automatic analysis approach once dependency tracking is implemented.  The 
analysis will be performed on-demand, perhaps as part of describing the view or 
submitting a query job against it.  Until this becomes available, users may be 
able to do their own analysis either via empirical lineage tools or via 
view->table dependency tracking metadata once it is implemented.  See HIVE-1079.
  
