Thanks for the link. I’m still running 1.3.1 but will give it a try :) Hao
> On Jun 13, 2015, at 9:38 AM, Will Briggs <wrbri...@gmail.com> wrote:
>
> Check out this recent post by Cheng Lian regarding dynamic partitioning in Spark 1.4: https://www.mail-archive.com/user@spark.apache.org/msg30204.html
>
>> On Jun 13, 2015, at 5:41 AM, Hao Wang <bill...@gmail.com> wrote:
>>
>> Hi,
>>
>> I have a bunch of large log files on Hadoop. Each line contains a log message and its severity. Is there a way I can use Spark to split the entire data set into different files on Hadoop according to the severity field? Thanks. Below is an example of the input and output.
>>
>> Input:
>> [ERROR] log1
>> [INFO] log2
>> [ERROR] log3
>> [INFO] log4
>>
>> Output:
>> error_file
>> [ERROR] log1
>> [ERROR] log3
>>
>> info_file
>> [INFO] log2
>> [INFO] log4
>>
>> Best,
>> Hao Wang
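For anyone landing on this thread: since Hao is on 1.3.1 (where DataFrameWriter.partitionBy from the linked 1.4 post isn't available), the split can also be done with the plain RDD API, which exists in 1.3.x. Below is a minimal, hedged sketch; the paths, the `severity_of` helper, and the `sc` argument are illustrative, not from the original messages. It collects the distinct severities first, then writes one filtered output per severity.

```python
import re

def severity_of(line):
    """Extract the bracketed severity tag: '[ERROR] log1' -> 'ERROR'."""
    m = re.match(r"\[(\w+)\]", line)
    return m.group(1) if m else "UNKNOWN"

def split_by_severity(sc, input_path, output_path):
    """Sketch only: requires a live SparkContext `sc`; paths are hypothetical.

    Writes one directory per severity, e.g. <output_path>/error_file.
    Uses only RDD operations available in Spark 1.3.x
    (textFile, map, distinct, filter, saveAsTextFile).
    """
    logs = sc.textFile(input_path).cache()
    for sev in logs.map(severity_of).distinct().collect():
        # Bind `sev` via a default argument so the closure
        # doesn't capture the loop variable by reference.
        logs.filter(lambda line, s=sev: severity_of(line) == s) \
            .saveAsTextFile("%s/%s_file" % (output_path, sev.lower()))
```

One caveat with this approach: it scans the cached data once per distinct severity, so it fits best when the number of severities is small (as with log levels). For many partitions, the 1.4-style `partitionBy` from the linked post, or a custom `MultipleTextOutputFormat` via `saveAsHadoopFile`, writes everything in a single pass.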