Can you look at the task list on the Spark dashboard and describe the number
of mappers and reducers and the time each of them takes?
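
In case it helps, here is a rough sketch of how one might get the same numbers
from the shell: the partition count (which is the number of tasks the filter
stage runs) and the wall-clock time of the filter. The object name
FilterTiming, its helpers, and the input path in the usage comment are made up
for illustration and are not from the original job.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical helpers for illustration; not part of Spark itself.
object FilterTiming {
  // Run a block and return its result with the elapsed wall-clock time in seconds.
  def timed[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e9)
  }

  // Count the lines containing `needle`; count() is the action that triggers the job.
  def countMatches(lines: RDD[String], needle: String): Long =
    lines.filter(line => line.contains(needle)).count()

  // Number of partitions, i.e. the number of tasks the filter stage will run.
  def taskCount(lines: RDD[String]): Int = lines.partitions.length
}

// Possible usage in spark-shell (the path is an assumption):
//   val lines = sc.textFile("/path/to/data").cache()
//   lines.count()  // first action materializes the cache
//   println(FilterTiming.taskCount(lines))
//   val (n, secs) = FilterTiming.timed(FilterTiming.countMatches(lines, "someUrl"))
```

If the task count turns out to be 1 or 2, the filter is probably running on
very few cores, which would be worth knowing before tuning anything else.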


Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi



On Mon, Feb 3, 2014 at 12:39 AM, suman bharadwaj <[email protected]> wrote:

> Hi,
>
> I was exploring Spark and, in the process, trying to search a column
> containing URLs.
>
> Basically, we are applying a contains operator to the column. It takes
> more than 3 minutes to return the results. Is there any way to optimize this
> query?
>
> .filter(line => line.contains("someUrl"))
>
> I currently have a system in standalone mode with *8 GB RAM*.
> Everything is stored in memory in deserialized format. The data size in
> memory (deserialized) is around *1 GB*.
>
>
> Any suggestions?
>
> Thanks in advance.
>
> Regards,
> SB
>
