Thanks for your feedback. It would be great to have the design open as a
GitHub issue. Can people leave comments there? It would be nice to discuss
the implementation details before we actually start. I can take the lead on
development and start checking in some working code.

Thanks,
Dev


On Fri, May 10, 2013 at 6:20 AM, Lucas Meneghel Rodrigues 
<[email protected]> wrote:

> On 09/05/13 05:20 PM, Dev Priya wrote:
>
>> Hi,
>>
>> I have been working on a persistent log storage solution for Autotest
>> and want to discuss my thoughts with you, seek your advice and
>> find out whether prior work or solutions exist on this front. As of now,
>> Autotest in its default config stores the logs locally on the results
>> server. This configuration gives us neither redundancy nor very large
>> storage capacity. To tackle this issue I am thinking of
>> implementing a variant of ResultsArchiver that archives the log files
>> and stores them on HDFS.
>>
>
> That's a great idea. I have some comments to make below.
>
>
>> The proposed changes are as follows:
>>
>> 1. The config file will dictate whether to use local storage or HDFS.
>> 2. All HDFS-related configs will be in the global config file.
>> 3. ResultsArchiver's HDFS implementation can either use Python libraries
>> or wrap command line tools to push a file to HDFS. I am even planning to
>> explore HttpFS for Hadoop.
>> 4. For reading the files, the Apache file handler currently handles the
>> file rendering. We can use HttpFS to access the files directly from HDFS,
>> and this will need some alteration to the file URLs. I think this can be
>> achieved with some rewrite rules.
>> 5. Another solution, which will be better performance-wise but harder to
>> implement, is to cache the files locally and then deliver them through
>> the Apache file handler as we are doing now. The details of this
>> implementation are yet to be sorted out; again, your feedback will be
>> valuable here.
>>
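
To make items 1 to 4 above concrete, here is a rough sketch in Python 3. It
is not code from Autotest: the settings dictionary, its keys, the host name
and the function names are all hypothetical stand-ins for whatever the global
config file would provide. It assumes the command line route from item 3
(`hadoop fs -put`) and the WebHDFS-style REST path that HttpFS serves, on its
default port 14000, for item 4.

```python
import posixpath
import subprocess
from urllib.parse import quote, urlencode

# Hypothetical settings; a real implementation would read these from the
# Autotest global config file (items 1 and 2 of the proposal).
HDFS_SETTINGS = {
    "storage_backend": "hdfs",  # "local" or "hdfs"
    "hdfs_results_root": "/autotest/results",
    "httpfs_base_url": "http://httpfs.example.com:14000",
    "hdfs_user": "autotest",
}


def hdfs_put_command(local_path, job_tag, settings=HDFS_SETTINGS):
    """Build the `hadoop fs -put` command for one archived result file."""
    name = posixpath.basename(local_path)
    dest = posixpath.join(settings["hdfs_results_root"], job_tag, name)
    return ["hadoop", "fs", "-put", local_path, dest]


def archive(local_path, job_tag, settings=HDFS_SETTINGS):
    """Push the file to HDFS if configured; otherwise leave it where it is."""
    if settings["storage_backend"] != "hdfs":
        return None
    cmd = hdfs_put_command(local_path, job_tag, settings)
    subprocess.check_call(cmd)  # needs a Hadoop client on PATH
    return cmd[-1]  # HDFS destination path


def httpfs_open_url(results_path, settings=HDFS_SETTINGS):
    """Map a results-relative path to an HttpFS OPEN URL (item 4)."""
    hdfs_path = posixpath.join(
        settings["hdfs_results_root"], results_path.lstrip("/"))
    query = urlencode({"op": "OPEN", "user.name": settings["hdfs_user"]})
    return "%s/webhdfs/v1%s?%s" % (
        settings["httpfs_base_url"], quote(hdfs_path), query)
```

An Apache rewrite rule could then point existing results URLs at whatever
`httpfs_open_url` produces. The local caching approach from item 5 is left
out here, since its details are still open.
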
>> Has a solution to this storage problem been attempted in the past
>>
>
> Not that I'm aware of. The drone architecture in Autotest lets us spread
> the load of autoserv processes across several machines, which we call
> drones, but no special treatment is given to log files.
>
>
>> or do we
>> have any existing solution inside Autotest already that I might have
>> missed? If not, then does my proposed plan look good and will it be
>> something we would like to see in Autotest?
>>
>
> Yes, I definitely want to see it in Autotest, as we sometimes have trouble
> with our internal test grid logs. One thing I was thinking is that
> GlusterFS might be an interesting option here. It even has a drop-in
> compatibility library that lets GlusterFS replace HDFS, so that's
> something interesting to explore.
>
> If you feel like it, we could open the design as a GitHub issue or
> something, so it can be tracked, and people could help with tasks.
>
> Cheers,
>
> Lucas
>
>
>
_______________________________________________
Autotest-kernel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/autotest-kernel
