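Sorry, my English really is quite poor.
I wrote a MapReduce program that reads data from a Kudu table, processes it, and writes it to another Kudu table.
At first everything worked. After a while I found that the program could not retrieve all of the data, even though impala-shell could query all of it.
I imported part of the data into a new table. Reading from the new table returned all of the data.
I suspected the table itself was the problem, so I ran a series of tests.
I think I have roughly found the cause: I have been writing to the original table continuously while also deleting historical data, and that table has only HASH partitioning. I think these factors caused the problem.
I have now re-created the table with both HASH and RANGE partitioning, and so far everything looks normal.
Is my analysis correct? Can frequent inserts, deletes, and queries on a table cause problems like this?
Salt
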
2017-12-19 7:47 GMT+08:00 Hao Hao <hao....@cloudera.com>:

> Hi Salt,
>
> You can reply to me in Chinese, but your earlier description was not very clear.
> Besides the earlier question, could you also send your Impala query, the values you expected, and the values you actually got?
> Also, Mike suggested that you use the base64 predicate-serialization functions. Have you tried that?
>
> Best,
> Hao
>
> On Mon, Dec 18, 2017 at 2:42 PM, Hao Hao <hao....@cloudera.com> wrote:
>
>> Hi Salt,
>>
>> Can you reproduce the issue with your code? If so, can you post it? Thanks!
>>
>> Best,
>> Hao
>>
>> On Mon, Dec 18, 2017 at 12:47 PM, Mike Percy <mpe...@apache.org> wrote:
>>
>>> Hi Hao,
>>> Is there enough information here to help with this?
>>>
>>> Thanks,
>>> Mike
>>>
>>> ---------- Forwarded message ----------
>>> From: zha...@broadtech.com.cn <zha...@broadtech.com.cn>
>>> Date: Fri, Dec 15, 2017 at 2:10 AM
>>> Subject: Re: Re: Question about a Kudu Mapreduce program
>>> To: Mike Percy <mpe...@apache.org>
>>>
>>>
>>> Sorry, my English really is quite poor.
>>> I wrote a MapReduce program that reads data from a Kudu table, processes it, and writes it to another Kudu table.
>>> At first everything worked. After a while I found that the program could not retrieve all of the data, even though impala-shell could query all of it.
>>> I imported part of the data into a new table. Reading from the new table returned all of the data.
>>> I suspected the table itself was the problem, so I ran a series of tests.
>>> I think I have roughly found the cause: I have been writing to the original table continuously while also deleting historical data, and that table has only HASH partitioning. I think these factors caused the problem.
>>> I have now re-created the table with both HASH and RANGE partitioning, and so far everything looks normal.
>>> Is my analysis correct? Can frequent inserts, deletes, and queries on a table cause problems like this?
>>> Salt
>>>
>>> ------------------------------
>>> zha...@broadtech.com.cn
>>>
>>>
>>> *From:* Mike Percy <mpe...@apache.org>
>>> *Sent:* 2017-12-15 16:43
>>> *To:* zha...@broadtech.com.cn
>>> *Subject:* Re: Question about a Kudu Mapreduce program
>>> You may want to post your code and explain the problem you are having.
>>>
>>> You could also try sending your message in both English and Chinese.
>>> I know we have some Chinese speakers on the mailing list, so maybe
>>> more people can help you with this problem if you also explain the issue in
>>> Chinese.
>>>
>>> I say this because in your original email your problem was not easy to
>>> understand.
>>>
>>> Mike
>>>
>>> On Dec 14, 2017, at 11:30 PM, "zha...@broadtech.com.cn" <
>>> zha...@broadtech.com.cn> wrote:
>>>
>>> I have subscribed to user@kudu.apache.org. What should I do next?
>>>
>>> Sorry for my mistake.
>>>
>>> Salt
>>>
>>> ------------------------------
>>> zha...@broadtech.com.cn
>>>
>>>
>>> *From:* Mike Percy <mpe...@apache.org>
>>> *Sent:* 2017-12-15 10:42
>>> *To:* zha...@broadtech.com.cn
>>> *Subject:* Re: Re: Question about a Kudu Mapreduce program
>>> Are you a subscriber to the user@kudu.apache.org email list? When you
>>> send a message to user@kudu.apache.org and people "reply" to your
>>> email, the reply does not go to your email address; it goes to
>>> user@kudu.apache.org. So, if you have not subscribed to the
>>> mailing list, you will not see their replies.
>>>
>>> I am asking you about this because I replied to your email on the list.
>>> I asked for more information, but you have not responded on the mailing list
>>> yet.
>>>
>>> Example: https://lists.apache.org/thread.html/56bcedcca1b5fd1009b928b1fff4b44accfefe34a28295d5a73e6ee8@%3Cuser.kudu.apache.org%3E
>>>
>>> Mike
>>>
>>> On Thu, Dec 14, 2017 at 5:46 PM, zha...@broadtech.com.cn <
>>> zha...@broadtech.com.cn> wrote:
>>>
>>>> Sorry, I don't know what you are talking about.
>>>>
>>>> ------------------------------
>>>> zha...@broadtech.com.cn
>>>>
>>>>
>>>> *From:* Mike Percy <mpe...@apache.org>
>>>> *Sent:* 2017-12-15 03:28
>>>> *To:* zhaolr <zha...@broadtech.com.cn>
>>>> *Subject:* Re: Question about a Kudu Mapreduce program
>>>> Hi, by default, replies go to the list. Are you subscribed?
>>>>
>>>>
>>>> On Wed, Dec 13, 2017 at 12:07 AM, Mike Percy <mpe...@apache.org> wrote:
>>>>
>>>>> Can you please post your code and explain the problem you're seeing?
>>>>>
>>>>> On Tue, Dec 12, 2017 at 11:00 PM, zha...@broadtech.com.cn <
>>>>> zha...@broadtech.com.cn> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I have a MapReduce program that uses KuduTableInputFormat to read
>>>>>> data from a Kudu table and write it to another Kudu table. For some
>>>>>> tables it gets all the data, but for others it gets only a small part
>>>>>> of the data.
>>>>>> Testing revealed some patterns:
>>>>>> 1. The problem tables were created a few months ago; the tables
>>>>>> without problems are new.
>>>>>> 2. Through impala-shell, all the data in a problem table can be retrieved.
>>>>>> 3. When some of the data in a problem table is exported to a new table,
>>>>>> all of that data can be read.
>>>>>> 4. According to kudu cluster ksck, all the tables are normal.
>>>>>> What causes this problem? Can you help me?
>>>>>> ------------------------------
>>>>>> zha...@broadtech.com.cn
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>