Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "Hive/ViewDev" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/ViewDev?action=diff&rev1=6&rev2=7

--------------------------------------------------

  
   * add queryable view support at the SQL language level (specifics of the 
scoping are under discussion in the Issues section below)
    * updatable views will not be supported
-  * make sure views show up anywhere tables can currently be 
enumerated/searched
+  * make sure views and their definitions show up anywhere tables can 
currently be enumerated/searched/described
   * where relevant, provide additional metadata to allow views to be 
distinguished from tables
  
  Beyond this, we may want to
@@ -35, +35 @@

  There are similar issues with other kinds of references in the view 
definition; for example, if a table or function name can be qualified, then the 
reference should be bound at the time the view is created.
  
  Implementing this typically requires expanding the view definition into an 
explicit form rather than storing the original view definition text directly.  
Doing this could require adding "unparse" support to the AST model (to be 
applied after object name resolution takes place), something which is not 
currently present (and which is also useful to have available in general).
+ 
+ However, storing both the expanded form and the original view definition text 
as well can also be useful for both DESCRIBE readability as well as 
functionality (see later section on ALTER VIEW v RECOMPILE).
  
  == Metastore Modeling ==
  
@@ -81, +83 @@

  
  However, if later we want to introduce persistent functions, or track column 
dependencies, this model will be insufficient, and we may need to introduce 
inheritance, with a DependencyParticipant base class from which tables, 
columns, functions etc all derive.  (Again, need to verify that JDO inheritance 
will actually support what we want here.)
  
+ '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
the bare-minimum MySQL approach (with no metastore support for dependency 
tracking), then if time allows, add dependency analysis and storage, followed 
by CASCADE support.
+ 
  == Dependency Invalidation ==
  
  What happens when an object is modified underneath a view?  For example, 
suppose a view references a table's column, and then ALTER TABLE is used to 
drop or replace that column.  Note that if the column's datatype changes, the 
view definition may remain meaningful, but the view's schema may need to be 
updated to match.  Here are two possible options:
@@ -89, +93 @@

   * '''Lenient''': allow the update to proceed (and maybe warn the user of the 
impact), potentially leaving the view in an invalid state.  Later, when an 
invalid view definition is referenced, throw a validation exception for the 
referencing query.  This is the approach taken by MySQL.  In the case of 
datatype changes, derived column datatypes already stored in metastore for 
referencing views would become stale until those views were recreated.
  
  Note that besides table modifications, other operations such as CREATE OR 
REPLACE VIEW have similar issues (since views can reference other views).  The 
lenient approach provides a reasonable solution for the related issue of 
external tables whose schemas may be dynamic (not sure if we currently support 
this).
+ 
+ '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
the lenient approach, without any support for marking objects invalid in the 
metastore, then follow up with strict support and possibly metastore support 
for tracking object validity.
  
  == View Modification ==
  
@@ -99, +105 @@

  
  Note that supporting view modification requires detection of cyclic view 
definitions, which should be invalid.  Whether this detection is carried out at 
the time of view modification versus reference is dependent on the strict 
versus lenient approaches to dependency invalidation described above.
  
+ '''Update 30-Dec-2009''':  Based on a design review meeting, we'll start with 
an Oracle-style ALTER VIEW v RECOMPILE, which can be used to revalidate a view 
definition, as well as to re-expand the original definition for clauses such as 
select *.  Then if time allows, we'll follow up with CREATE OR REPLACE VIEW 
support.  (The latter is less important since we're going with the lenient 
invalidation model, making DROP and re-CREATE possible without having to deal 
with downstream dependencies.)
+ 
  == Fast Path Execution ==
  
  For '''select * from t''', hive supports fast-path execution (skipping 
Map/Reduce).  Is it important for this to work for '''select * from v''' as 
well?
+ 
+ '''Update 30-Dec-2009''':  Based on feedback in JIRA, we'll leave this as 
dependent on getting the fast-path working for the underlying filters and 
projections.
  
  == ORDER BY and LIMIT in view definition ==
  
  SQL:200n prohibits ORDER BY in a view definition, since a view is supposed to 
be a virtual (unordered) table, not a query alias.  However, many DBMS's ignore 
this rule; for example, MySQL allows ORDER BY, but ignores it in the case where 
it is superceded by an ORDER BY in the query.  Should we prevent ORDER BY?  
This question also applies to the LIMIT clause.
  
+ '''Update 30-Dec-2009''':  Based on feedback in JIRA, ORDER BY is important 
as forward-looking to materialized views.  LIMIT may be less important, but we 
should probably support it too for consistency.
+ 

Reply via email to