Hi Travis,

Thanks a ton for this issue, I know I will enjoy solving it (: I have
some questions about this JIRA, even though I think I understand what
the problem is.

- How do you think I should approach this? I mean if HCat can't send
the partitions' information through the configuration object, maybe we
should think of a different way of communicating this information
(thrift, or the database)?
- I was looking at HCatLoader but I am not sure if this would be a good
entry point for the modifications. Any suggestions?

Thanks again Travis!


Renato M.


2012/8/30 Travis Crawford <[email protected]>:
> You might be interested in https://issues.apache.org/jira/browse/HCATALOG-453
>
> The issue here is HCatalog queries the HiveMetaStore for info about
> the partitions to process, and stores that response in the job conf.
> When processing large numbers of partitions this bloats the job conf
> beyond what Hadoop will allow and the job fails.
>
> What's interesting about this issue is you'll learn about the main
> feature of HCatalog - translating db+table+partition_spec into a list
> of partitions, how HCat handles that internally, and how it's
> communicated between the frontend & backend. The actual issue is
> straightforward, but I think spending the time to understand the
> problem will give a great overview of how HCat works.
>
> Thoughts?
>
> --travis
>
>
>
> On Thu, Aug 30, 2012 at 4:25 PM, Renato Marroquín Mogrovejo
> <[email protected]> wrote:
>> Travis,
>>
>> Thanks a lot for your response! My master's dissertation was about
>> using statistics to smarten up Apache Pig's rule optimizer, so I would
>> love to help out with something related, but maybe you can suggest
>> some interesting JIRAs (not complicated ones but maybe "noobies" ones)
>> I can start with (:
>> And yeah, the labels thing is much better than creating a JIRA type for
>> noobies. Thanks again!
>>
>>
>> Renato M.
>>
>> 2012/8/30 Travis Crawford <[email protected]>:
>>> Hey Renato -
>>>
>>> Awesome! What in particular are you interested in starting out with?
>>> We can definitely find a starter project for you in that area.
>>>
>>> JIRA issues can have a variety of attributes; the attribute I started
>>> this thread about is the "issue type".
>>>
>>> JIRA also has "labels", which I think are a great place to indicate
>>> something would be good for noobies. For example, there could be an
>>> "issue type" of bug, with "label" noobie.
>>>
>>> Let us know what area you're interested in diving into and we can help
>>> come up with a starter project for ya.
>>>
>>> --travis
>>>
>>>
>>> On Thu, Aug 30, 2012 at 9:21 AM, Renato Marroquín Mogrovejo
>>> <[email protected]> wrote:
>>>> Hi all,
>>>>
>>>> I am new to HCatalog but I would like to get involved with the
>>>> project, and one thing that would totally help is to create an issue
>>>> type that indicates it is for "newbies". I saw that in Apache Pig they
>>>> have a special type of issue for this and with this they try to engage
>>>> more with the community. This would be awesome guys!
>>>> Thanks in advance!
>>>>
>>>>
>>>> Renato M.
>>>>
>>>> 2012/8/30 Travis Crawford <[email protected]>:
>>>>> Hey hcat gurus -
>>>>>
>>>>> Filing an issue just now I noticed the list of possible issue types
>>>>> is pretty crazy long - any objection to requesting a simplification
>>>>> to:
>>>>>
>>>>> PROPOSED ISSUE TYPES:
>>>>>
>>>>> Bug - fixing unintended behavior
>>>>> New Feature - addition of brand-new functionality
>>>>> Improvement - making existing functionality better
>>>>>
>>>>> CURRENT ISSUE TYPES:
>>>>>
>>>>> Bug
>>>>> New Feature
>>>>> Improvement
>>>>> Test
>>>>> Wish
>>>>> Task
>>>>> New JIRA Project
>>>>> RTC
>>>>> TCK Challenge
>>>>> Question
>>>>> Temp
>>>>> Brainstorming
>>>>> Umbrella
>>>>> Epic
>>>>> Dependency upgrade
>>>>> Suitable Name Search
>>>>>
>>>>> If this sounds good I'll ping the infra folks and try to make this happen.
>>>>>
>>>>> --travis
