date:20170515

Hive handling of ingested data when source column changes size or new column added

2017-05-15 Thread Mich Talebzadeh

Assume we are ingesting into a Hive table from an Oracle RDBMS table. This is done through a daily mechanism. My conclusion is this. 1. The source column has moved from VARCHAR2(50) to VARCHAR2(100). As far as I know this should not matter in Hive, as every VARCHAR is stored as String in Hive.

Re: operation log is missing when using hive.execution.engine=mr

2017-05-15 Thread Jie Zhang

Hi, Peter, Exactly! After setting hive.async.log.enabled=false and restarting HiveServer2, the MR job progress is printed in the operation log. Thanks very much for your help! Jessica On Mon, May 15, 2017 at 10:56 AM, Peter Vary wrote: > Hi Jessica, > > Is it possible that you are affected by this

Re: operation log is missing when using hive.execution.engine=mr

2017-05-15 Thread Peter Vary

Hi Jessica, Is it possible that you are affected by this? https://issues.apache.org/jira/browse/HIVE-16061 Thanks, Peter On May 15, 2017 at 19:44, "Jie Zhang" wrote: Hi, My team just upgraded Hive from 0.14.0 to 2.1.1. The operation log is missing when running the query, no query progress i

operation log is missing when using hive.execution.engine=mr

2017-05-15 Thread Jie Zhang

Hi, My team just upgraded Hive from 0.14.0 to 2.1.1. The operation log is missing when running a query; no query progress is printed. The only log printed in the operation log is "WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different e

Re: How can i merge multiple rows to one row in sparksql or hivesql?

2017-05-15 Thread Edward Capriolo

Here is how I did something similar, though not identical, to what you describe. I had two data files in different formats, and the different columns needed to become different features. I wanted to feed them into spark's: https://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Frequent_Pattern_Mining/The_FP-

Re: How can i merge multiple rows to one row in sparksql or hivesql?

2017-05-15 Thread goun na

I had it the wrong way round: collect_list generates duplicated results. 2017-05-16 0:50 GMT+09:00 goun na : > Hi, Jone Zhang > > 1. Hive UDF > You might need collect_set or collect_list (to eliminate duplication), but > make sure to reduce its cardinality before applying UDFs as it can cause > problems

Re: How can i merge multiple rows to one row in sparksql or hivesql?

2017-05-15 Thread goun na

Hi, Jone Zhang 1. Hive UDF You might need collect_set or collect_list (to eliminate duplication), but make sure to reduce its cardinality before applying UDFs, as it can cause problems while handling 1 billion records. Union dataset 1,2,3 -> group by user_id1 -> collect_set (feature column) would work
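The union -> group by -> collect pipeline suggested above can be illustrated with a small plain-Python sketch. This is not Hive or Spark code: `merge_features` and the sample rows are hypothetical, and the two result dictionaries only mimic the semantics of collect_list (duplicates kept) and collect_set (duplicates dropped). Hive's collect_set does not guarantee any ordering; the first-seen order here is an artifact of the sketch.

```python
from collections import defaultdict
from itertools import chain


def merge_features(*datasets):
    """Union the datasets, group rows by user id, and collect the features.

    Returns two dicts keyed by user id:
    - the first mimics collect_list (duplicates are kept),
    - the second mimics collect_set (duplicates are dropped).
    """
    as_list = defaultdict(list)
    # chain(*datasets) plays the role of UNION ALL over the inputs.
    for user_id, feature in chain(*datasets):
        as_list[user_id].append(feature)
    # dict.fromkeys removes duplicates while keeping first-seen order.
    as_set = {user: list(dict.fromkeys(feats)) for user, feats in as_list.items()}
    return dict(as_list), as_set


# Hypothetical sample rows shaped like the (user_id, feature) pairs in the question.
data1 = [("user_id1", "feature1"), ("user_id1", "feature2")]
data2 = [("user_id1", "feature3")]
data3 = [("user_id1", "feature4"), ("user_id1", "feature4")]

collected_list, collected_set = merge_features(data1, data2, data3)
print(collected_list["user_id1"])
print(collected_set["user_id1"])
```

With a billion rows per input the same shape would be expressed in the engine itself rather than in driver-side Python; the sketch only shows why the choice between the two aggregates matters when duplicates are present.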

How can i merge multiple rows to one row in sparksql or hivesql?

2017-05-15 Thread Jone Zhang

For example:
Data1 (has 1 billion records)
user_id1 feature1
user_id1 feature2
Data2 (has 1 billion records)
user_id1 feature3
Data3 (has 1 billion records)
user_id1 feature4
user_id1 feature5
...
user_id1 feature100
I want to get the result as follows:
user_id1 feature1 feature2 feature3 featu

