You might be interested in https://issues.apache.org/jira/browse/HCATALOG-453

The issue here is that HCatalog queries the HiveMetaStore for info about
the partitions to process and stores that response in the job conf.
When processing a large number of partitions, this bloats the job conf
beyond what Hadoop will allow, and the job fails.
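
To make the growth concrete, here's a rough Python sketch. It is not
HCatalog code: the conf key, the shape of the partition metadata, and
the 5 MB cap are all made up for illustration, and the real limit
depends on your Hadoop version and settings. It only shows that putting
per-partition metadata into the conf makes its size grow with the
partition count.

```python
import json

# Assumed cap on the serialized job conf, for illustration only.
ASSUMED_CONF_LIMIT_BYTES = 5 * 1024 * 1024


def fake_partition_info(i):
    """Hypothetical stand-in for the metadata returned for one partition."""
    return {
        "location": "hdfs://namenode/warehouse/db/table/dt=%06d" % i,
        "values": ["%06d" % i],
        "columns": [{"name": "col%d" % c, "type": "string"} for c in range(20)],
    }


def build_job_conf(num_partitions):
    """Serialize every partition's metadata into a single conf value."""
    partitions = [fake_partition_info(i) for i in range(num_partitions)]
    # "example.partition.info" is a made-up key, not a real HCatalog property.
    return {"example.partition.info": json.dumps(partitions)}


def conf_size_bytes(conf):
    """Approximate conf size as the total bytes of its keys and values."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for k, v in conf.items())


if __name__ == "__main__":
    for n in (10, 1000, 10000):
        size = conf_size_bytes(build_job_conf(n))
        status = "over assumed limit" if size > ASSUMED_CONF_LIMIT_BYTES else "ok"
        print(n, "partitions:", size, "bytes,", status)
```

The size grows roughly linearly with the number of partitions, so any
fixed cap on the conf is eventually exceeded.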

What's interesting about this issue is that you'll learn about the main
feature of HCatalog - translating db+table+partition_spec into a list
of partitions - as well as how HCat handles that internally and how it's
communicated between the frontend & backend. The actual issue is
straightforward, but I think spending the time to understand the
problem will give a great overview of how HCat works.

Thoughts?

--travis



On Thu, Aug 30, 2012 at 4:25 PM, Renato Marroquín Mogrovejo
<[email protected]> wrote:
> Travis,
>
> Thanks a lot for your response! My master's dissertation was about
> using statistics to smarten up the Apache Pig rule optimizer, so I would
> love to help out with something related, but maybe you can suggest
> some interesting JIRAs (not complicated ones but maybe "noobies" ones)
> I can start with (:
> And yeah, the labels thing is much better than creating a JIRA type for
> noobies. Thanks again!
>
>
> Renato M.
>
> 2012/8/30 Travis Crawford <[email protected]>:
>> Hey Renato -
>>
>> Awesome! What in particular are you interested in starting out with?
>> We can definitely find a starter project for you in that area.
>>
>> JIRA issues can have a variety of attributes; the attribute I started
>> this thread about is the "issue type".
>>
>> JIRA also has "labels", which I think are a great place to indicate
>> something would be good for noobies. For example, there could be an
>> "issue type" of bug, with "label" noobie.
>>
>> Let us know what area you're interested in diving into and we can help
>> come up with a starter project for ya.
>>
>> --travis
>>
>>
>> On Thu, Aug 30, 2012 at 9:21 AM, Renato Marroquín Mogrovejo
>> <[email protected]> wrote:
>>> Hi all,
>>>
>>> I am new to HCatalog but I would like to get involved with the
>>> project, and one thing that would totally help is to create an issue
>>> type that indicates it is for "newbies". I saw that in Apache Pig they
>>> have a special type of issue for this and with this they try to engage
>>> more with the community. This would be awesome guys!
>>> Thanks in advance!
>>>
>>>
>>> Renato M.
>>>
>>> 2012/8/30 Travis Crawford <[email protected]>:
>>>> Hey hcat gurus -
>>>>
>>>> While filing an issue just now, I noticed the list of possible issue
>>>> types is pretty crazy long - any objection to requesting a
>>>> simplification to:
>>>>
>>>> PROPOSED ISSUE TYPES:
>>>>
>>>> Bug - fixing unintended behavior
>>>> New Feature - addition of brand-new functionality
>>>> Improvement - making existing functionality better
>>>>
>>>> CURRENT ISSUE TYPES:
>>>>
>>>> Bug
>>>> New Feature
>>>> Improvement
>>>> Test
>>>> Wish
>>>> Task
>>>> New JIRA Project
>>>> RTC
>>>> TCK Challenge
>>>> Question
>>>> Temp
>>>> Brainstorming
>>>> Umbrella
>>>> Epic
>>>> Dependency upgrade
>>>> Suitable Name Search
>>>>
>>>> If this sounds good I'll ping the infra folks and try to make this happen.
>>>>
>>>> --travis
