Hive is more error-tolerant, whereas Kylin needs the data to be cleansed first, so that results stay accurate at a high aggregation level.
I don't have a good idea either; you may need to check the upstream system to see whether there was a problem.

2016-05-06 23:35 GMT+08:00 胡志华(万里通科技及数据中心商务智能团队数据分析组) <[email protected]>:

> I think it's difficult to check; the amount of data is huge.
>
> Could you give me some suggestions?
>
> -----Original Message-----
> From: ShaoFeng Shi [mailto:[email protected]]
> Sent: May 6, 2016 23:33
> To: [email protected]
> Subject: Re: Re: problem happened at step "build base cuboid data"
>
> I mean the data in the Hive table; if there is some dirty data (e.g., a column
> was declared as decimal but actually holds a string), it may cause the cube
> build to fail.
>
> 2016-05-06 23:29 GMT+08:00 胡志华(万里通科技及数据中心商务智能团队数据分析组) <[email protected]>:
>
> > You mean the data type? Here is the table description:
> >
> > > desc partner_txn_sub_order_ft0_s;
> > OK
> > sub_txn_id                  string
> > txn_id                      string
> > order_id                    string
> > sub_order_id                string
> > pp_order_id                 string
> > pp_sub_order_id             string
> > process_dt                  string
> > slt_doc_id                  string
> > doc_id                      string
> > sub_doc_id                  string
> > gain_pay_ind                string
> > pathway_ind                 string
> > txn_type_ind                string
> > txn_sub_type_ind            string
> > product_id                  string
> > product_name                string
> > rule_id                     string
> > rule_name                   string
> > slt_partner_id              string
> > slt_partner_desc            string
> > pay_cash                    decimal(22,7)
> > pay_points                  decimal(22,7)
> > gain_points                 decimal(22,7)
> > discount                    decimal(22,7)
> > age_level_ind               string
> > gender_ind                  string
> > phone_province_ind          string
> > phone_city_ind              string
> > point_current_level_ind     string
> > binding_d                   string
> > binding_m                   string
> > is_email_verified           int
> > is_mobile_verified          int
> > is_app                      int
> > partner_gain_pt_level_ind   string
> > wlt_txn_level_ind           string
> > brand_point_no              string
> > pathway_desc                string
> > is_activity                 int
> > pt_log_d                    string
> > partner_id                  string
> >
> > # Partition Information
> > # col_name                  data_type       comment
> >
> > pt_log_d                    string
> > partner_id                  string
> > Time taken: 0.124 seconds, Fetched: 47 row(s)
> > hive>
> >
> > -----Original Message-----
> > From: ShaoFeng Shi [mailto:[email protected]]
> > Sent: May 6, 2016 23:27
> > To: [email protected]
> > Subject: Re: problem happened at step "build base cuboid data"
> >
> > It seems some data couldn't be parsed as a BigDecimal. You may need to check
> > the data type in the source table.
> >
> > 2016-05-06 20:34 GMT+08:00 胡志华(万里通科技及数据中心商务智能团队数据分析组) <[email protected]>:
> >
> > > Hi all,
> > >
> > > I am encountering a problem at step "build base cuboid data"; the
> > > MapReduce log is below.
> > >
> > > I googled it but found nothing useful. Can anyone help me?
> > >
> > > Error: java.lang.NumberFormatException
> > >     at java.math.BigDecimal.<init>(BigDecimal.java:470)
> > >     at java.math.BigDecimal.<init>(BigDecimal.java:739)
> > >     at org.apache.kylin.measure.basic.BigDecimalIngester.valueOf(BigDecimalIngester.java:39)
> > >     at org.apache.kylin.measure.basic.BigDecimalIngester.valueOf(BigDecimalIngester.java:29)
> > >     at org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.buildValueOf(BaseCuboidMapperBase.java:189)
> > >     at org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.buildValue(BaseCuboidMapperBase.java:159)
> > >     at org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase.outputKV(BaseCuboidMapperBase.java:206)
> > >     at org.apache.kylin.engine.mr.steps.HiveToBaseCuboidMapper.map(HiveToBaseCuboidMapper.java:53)
> > >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> > >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> > >     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> > >     at java.security.AccessController.doPrivileged(Native Method)
> > >     at javax.security.auth.Subject.doAs(Subject.java:415)
> > >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> > >     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> > >
> > > **********************************************************************
> > > The information in this email is confidential and may be legally
> > > privileged. If you have received this email in error or are not the
> > > intended recipient, please immediately notify the sender and delete
> > > this message from your computer. Any use, distribution, or copying
> > > of this email other than by the intended recipient is strictly
> > > prohibited. All messages sent to and from us may be monitored to
> > > ensure compliance with internal policies and to protect our business.
> > > Emails are not secure and cannot be guaranteed to be error free as
> > > they can be intercepted, amended, lost or destroyed, or contain
> > > viruses. Anyone who communicates with us by email is taken to accept
> > > these risks.
> > >
> > > Note to senders and recipients:
> > > This email contains confidential information. If you received it in
> > > error, please notify the sender and delete it at once; do not use,
> > > distribute, or copy it. All incoming and outgoing email is subject to
> > > the company's compliance monitoring. Email may be intercepted,
> > > modified, lost, destroyed, or may contain computer viruses.
> > > **********************************************************************
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi
>
> --
> Best regards,
>
> Shaofeng Shi

--
Best regards,

Shaofeng Shi
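
One rough way to narrow this down, offered only as a sketch: export a sample of rows (or one partition at a time) and scan the four decimal(22,7) columns from the table description above for values that would not parse as a number. The Python below assumes the rows are already available as dictionaries of strings. The function name `find_dirty_values`, the sample rows, and the regular expression (an approximation of what `java.math.BigDecimal(String)` accepts) are illustrative assumptions, not part of Kylin or Hive.

```python
import re

# Approximation (an assumption, not Kylin code) of the plain decimal syntax that
# java.math.BigDecimal(String) accepts: optional sign, digits with an optional
# fraction, optional exponent. No surrounding whitespace is allowed.
_DECIMAL_RE = re.compile(r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?")

# Columns declared as decimal(22,7) in the table description quoted in this thread.
DECIMAL_COLUMNS = ("pay_cash", "pay_points", "gain_points", "discount")


def find_dirty_values(rows, columns=DECIMAL_COLUMNS):
    """Return (row_index, column, value) for every value that does not look like a decimal.

    `rows` is an iterable of dicts mapping column name to a string value.
    None is skipped here; how NULLs should be treated is left open in this sketch.
    """
    dirty = []
    for index, row in enumerate(rows):
        for column in columns:
            value = row.get(column)
            if value is None:
                continue
            if not _DECIMAL_RE.fullmatch(value):
                dirty.append((index, column, value))
    return dirty


# Invented sample rows, for illustration only.
sample = [
    {"pay_cash": "12.5000000", "pay_points": "0", "gain_points": "3", "discount": "0.1"},
    {"pay_cash": "N/A", "pay_points": "7", "gain_points": "", "discount": "1e2"},
]
print(find_dirty_values(sample))
```

Running a check like this per partition (pt_log_d, partner_id) may keep each pass small when the table is large.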
