Hi,
Thanks for the info. I understand ELT (Extract, Load, Transform) is more
appropriate for big data than traditional ETL. What are the major
advantages of this in the Big Data space?
For example, if I started using Sqoop to get data from traditional transactional
and Data Warehouse databases an
Adding to the valuable points raised by Grant and Jörn, one can also consider the
Total Cost of Ownership (TCO) of the application on Big Data compared to a
traditional warehouse like Oracle or Sybase IQ that utilizes vertical
scaling in an SMP environment on expensive SAN storage. Although there are a lot
of other abs
I think you should draw more attention to the fact that Hive is just one component
in the ecosystem. You can have many more components, such as ELT, integrating
unstructured data, machine learning, streaming data, etc. However, usually
analysts are not aware of the technologies and IT staff is not much
Yes, that is what I meant.
In practice, it is often not possible to reach a 100% perfectly denormalized
fact table (for instance, if you need type 1 dimension behavior).
PS:
We're using 'denormalize' a bit loosely here since we are comparing to a star
schema and not a 3NF schema. The proper te
Hi,
Your statement
“I read that this is due to something not being compiled against the correct
hadoop version.
my main question what is the binary/jar/file that can cause this?”
I believe this is the file in $HIVE_HOME/lib called
spark-assembly-1.3.1-hadoop2.4.0.jar which you need to b
Thank you sir.
Can you please describe in a bit more detail your vision of "a fully denormalized
columnar store"? Are you referring to getting rid of the star schema altogether in
Hive and replacing it with ORC tables?
Regards
On Friday, 18 December 2015, 21:13, Grant Overby (groverby)
wrote:
You forgot horizontal scaling.
A fully denormalized columnar store in Hive will outperform a star schema in
Oracle in every way imaginable at scale; however, if your data isn't big enough,
then this is a moot point.
If your data fits in a traditional BI warehouse, and especially if it does so
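As a rough illustration of Grant's point, a fully denormalized ORC table might look like the following sketch; the table and column names are hypothetical, not taken from the thread:

```sql
-- Hypothetical: one wide table replacing fact/dimension joins,
-- stored as ORC so queries read only the columns they touch.
CREATE TABLE sales_flat (
  sale_ts        TIMESTAMP,
  amount         DECIMAL(18,2),
  -- dimension attributes folded directly into the fact rows
  customer_name  STRING,
  customer_city  STRING,
  product_name   STRING,
  product_group  STRING
)
STORED AS ORC
TBLPROPERTIES ('orc.compress' = 'SNAPPY');
```

Because the dimensions are folded into the fact rows, queries avoid join costs entirely, at the price of redundant storage that columnar compression largely recovers.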
Gurus,
Some analysts keep asking me the advantages of having Hive tables when the star
schema in the Data Warehouse (DW) does the same.
For example, if you have fact and dimension tables in the DW and just import them
into Hive via, say, Sqoop, what are we going to gain?
I keep telling them storage econo
Hi Marcin,
If the DDL update to the main table involves a new column, then at the moment
Hive does not support adding a column. Yes, the schema in the metastore can
change, but the file system will not allow you to add values to the new column.
1. Thus, as discussed in "Adding a new column t
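The metastore-only nature of such a change can be sketched as follows (table and column names are hypothetical):

```sql
-- Hypothetical sketch: the metastore schema gains the column,
-- but the data files already on HDFS are untouched, so rows
-- written before the change read back as NULL for new_col.
ALTER TABLE main_table ADD COLUMNS (new_col STRING);
```

Only data written after the change can carry values for the new column; the existing files simply have no slot for it.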
During spark-submit when running hive on spark I get:
Exception in thread "main" java.util.ServiceConfigurationError:
org.apache.hadoop.fs.FileSystem: Provider
org.apache.hadoop.hdfs.HftpFileSystem could not be instantiated
Caused by: java.lang.IllegalAccessError: tried to access method
org.apac
Hi Gabriel,
Thanks for responding, this helps. One more question: currently I'm able
to load only UTF-8 encoded files. I see that from Hive 0.14 it supports
all encodings, but I couldn't load UTF-16, UTF-32, or big-endian
formats. Are these supported? Are there any SerDes available for these
UT
Could you create a JIRA with repro case?
Thanks,
Xuefu
On Thu, Dec 17, 2015 at 9:21 PM, Jone Zhang wrote:
> *My query is *
> set hive.execution.engine=spark;
> select
>
> t3.pcid,channel,version,ip,hour,app_id,app_name,app_apk,app_version,app_type,dwl_tool,dwl_status,err_type,dwl_store,dwl_maxs
Eugene/Susanth,
Thank you for pointing me in the direction of these features. I'll
investigate them further to see if I can put them to good use.
Cheers - Elliot.
On 17 December 2015 at 20:03, Sushanth Sowmyan wrote:
> Also, while I have not wiki-ized the documentation for the above, I
> have
Hi All,
We import our production database into Hive on a schedule using Sqoop.
Unfortunately, Sqoop won't update the table schema in Hive when the table
schema has changed in the source database.
Accordingly, to get updates to the table schema, we drop the Hive table
first.
Unfortunately, this ca
Hello all
I have a hive client (inside a Spark streaming process, if this is of any use)
reading from an external non-native table stored by an HBaseStorageHandler.
When I execute one client at a time it works fine, but when I start another
similar client I get the following error
org.apache.h
I think it is a Hive bug, something related to the metastore.
Here is the thing.
After I generated scale factor 300, named bigbench300, alongside bigbench100,
which already existed before,
I ran the Hive job with bigbench300. At first it was really fine.
Then I ran the Hive job with bigbench100 again. It w
Hi
Can you retry with hive.optimize.sort.dynamic.partition set to true?
Thanks
Prasanth
On Dec 18, 2015, at 3:48 AM, Hemanth Meka wrote:
Hi All,
We have a source table and a target table. data in source table is text and
without partitions and target tabl
Hi All,
We have a source table and a target table. The data in the source table is text
and without partitions, and the target table is an ORC table with 5 partition
columns, 6000 records, and 1400 partitions.
We are trying to insert overwrite the target table with the data in the source
table. This inser
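A minimal sketch of the kind of statement being described, with the setting Prasanth suggested applied first (table and column names are hypothetical):

```sql
-- Hypothetical sketch of a dynamic-partition insert overwrite.
SET hive.optimize.sort.dynamic.partition = true;   -- setting suggested in the thread
SET hive.exec.dynamic.partition.mode = nonstrict;  -- allow all partition columns to be dynamic
INSERT OVERWRITE TABLE target_orc PARTITION (p1, p2, p3, p4, p5)
SELECT col_a, col_b, p1, p2, p3, p4, p5
FROM source_text;
```

With sorted dynamic partitioning enabled, rows are sorted by partition key before writing, so each reducer keeps only one ORC writer open at a time instead of one per partition.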