Dear Stavros,

Thank you for your reply. I understand the data structures required during
the search. The 6GB is only the size of the actual index (the directory of
indexes); the whole dataset, including the segments, comes to 30GB.
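For reference, one way to keep the 6GB index memory-resident would be to
copy it into a Lucene RAMDirectory when the searcher starts. This is only a
minimal sketch against the Lucene 3.x-era API that Nutch shipped with; the
crawl/indexes/part-00000 path and the heap headroom are assumptions, not
something taken from the benchmark itself:

    import java.io.File;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.store.RAMDirectory;

    public class PreloadIndex {
        public static void main(String[] args) throws Exception {
            // Copy the on-disk index into the JVM heap; run with -Xmx set
            // well above the 6GB index size (e.g. -Xmx12g on a 16GB box).
            FSDirectory onDisk =
                FSDirectory.open(new File("crawl/indexes/part-00000"));
            RAMDirectory inMemory = new RAMDirectory(onDisk);
            onDisk.close();

            // Searches through this reader never touch the disk copy.
            IndexReader reader = IndexReader.open(inMemory);
            IndexSearcher searcher = new IndexSearcher(reader);
            System.out.println("Docs in memory-resident index: "
                + reader.numDocs());
            searcher.close();
        }
    }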

Best

Hailong

On Fri, Oct 19, 2012 at 9:03 AM, Volos Stavros <[email protected]> wrote:

> Dear Hailong,
>
> There are two components used when performing a query on the index
> serving node:
> (a) the actual index (under indexes)
> (b) the segments (under segments)
>
> What exactly is 6GB? Are you including the segments as well?
>
> Regards,
> -Stavros.
>
>
> ________________________________________
> From: Hailong Yang [[email protected]]
> Sent: Wednesday, October 17, 2012 4:51 AM
> To: [email protected]
> Cc: Lingjia Tang; Jason Mars
> Subject: How to fit the index into the memory for the web search benchmark
>
> Hi CloudSuite,
>
> I am experimenting with the web search benchmark. However, I am wondering
> how to fit the index into memory in order to avoid unnecessary disk
> accesses. I have a 6GB index crawled from Wikipedia, and the machine has
> 16GB of RAM. During the workload execution, I noticed periodic increases
> of about 2% in I/O utilization, and the memory used by the Nutch server
> was always less than 500MB. So I guess the whole index is not brought
> into memory by default before serving search queries, right? Could you
> tell me how to do that exactly as you did in the Clearing the Clouds
> paper? Thanks!
>
>
> Best
>
> Hailong
>
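The quoted question asks how to pre-load the index before serving queries.
One likely explanation for the sub-500MB resident size is that Lucene reads
the index through the OS page cache rather than the JVM heap, so an
alternative to the RAMDirectory sketch above would be to read every index
file once at startup: with 16GB of RAM, the 6GB index then stays cached.
Again only a sketch; the directory to warm is passed as an argument:

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;

    // Reads every file under a directory once so the OS page cache holds
    // the index; unlike RAMDirectory, this keeps the JVM heap small.
    public class WarmPageCache {
        public static void main(String[] args) throws IOException {
            warm(new File(args[0]));
        }

        static void warm(File f) throws IOException {
            if (f.isDirectory()) {
                File[] children = f.listFiles();
                if (children == null) return;
                for (File child : children) {
                    warm(child);
                }
            } else {
                byte[] buf = new byte[1 << 20];
                FileInputStream in = new FileInputStream(f);
                try {
                    // Discard the bytes; the read itself fills the cache.
                    while (in.read(buf) != -1) { }
                } finally {
                    in.close();
                }
            }
        }
    }

Usage would be something like: java WarmPageCache crawl/indexes (and the
same for the segments directory, if it also needs to be hot).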
