Actually, this is what Hive originally did -- it used to trust partitions it discovered via HDFS. That blind trust could be leveraged for exactly what you are requesting, since partitions follow a simple directory scheme (and there is precedent for such out-of-band data loading). However, this blind trust became incompatible with the extended feature set of external tables and per-partition schemas introduced earlier this year. Re-enabling this behavior via configuration is currently tracked as https://issues.apache.org/jira/browse/HIVE-493 'automatically infer existing partitions of table from HDFS files'.
On Tue, Aug 11, 2009 at 11:15 AM, Chris Goffinet <[email protected]> wrote:
> Hi
>
> I was wondering if anyone has thought about the possibility of having
> dynamic partitioning in Hive? Right now you typically use LOAD DATA or ALTER
> TABLE to add new partitions. It would be great if applications like Scribe,
> which can load data into HDFS, could just place the data into the correct
> folder structure for your partitions on HDFS. Has anyone investigated this?
> What is everyone else doing in regards to things like this? It seems a
> little error prone to have a cron job run every day adding new partitions. It
> might not even be possible to do dynamic partitioning, since it's a metadata
> read. But I'd love to hear thoughts?
>
> -Chris
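For what it's worth, the cron-job workaround described above can be a fairly small script. Here is a hedged sketch (not Hive code; the table name and path are illustrative) that walks a table's warehouse directory, parses Hive's key=value partition-directory naming convention, and emits the corresponding ALTER TABLE ... ADD PARTITION statements:

```python
# Illustrative sketch only: scan a table directory for Hive-style
# key=value partition subdirectories and generate ADD PARTITION DDL.
# The table name and directory layout are assumptions, not Hive APIs.
import os

def partition_statements(table_dir, table_name):
    """Return one ADD PARTITION statement per key=value subdirectory."""
    stmts = []
    for entry in sorted(os.listdir(table_dir)):
        path = os.path.join(table_dir, entry)
        if os.path.isdir(path) and "=" in entry:
            key, value = entry.split("=", 1)
            stmts.append(
                "ALTER TABLE %s ADD PARTITION (%s='%s');"
                % (table_name, key, value)
            )
    return stmts
```

The generated statements could then be piped to the Hive CLI on a schedule -- which is exactly the error-prone step that a configuration-driven solution like HIVE-493 would make unnecessary.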
