Hello Fu Bin-zhang,
The error message is very weird because FileSplit is a class derived from
InputSplit,
and the conversion is legal. However, I've seen this message several times. The error
is most likely related to the location of the Hadoop tmp directory. Could you
please compress and
se
t.ac.cn]
Sent: Thursday, April 12, 2012 3:27 PM
To: Djordje Jevdjic
Subject: Re: RE: A question about "data analytics"
Hello Djordje,
Thanks for your advice; the problem was indeed caused by the tmp directory.
I think the reason may be that I didn't reformat the namenode after
Dear Hasan,
The file was removed from that link 5 days ago. Thank you for pointing that out.
I corrected the link so you can try again.
Regarding the other error: you have to install and deploy Mahout completely and
without any errors.
Please tell me which version of Maven and which version of JD
Dear Jayneel,
The way the workload is set up corresponds to a typical use of the Hadoop
Map-Reduce framework, which means that each map task is a separate process. The
map task itself can be multithreaded, but we use Mahout's version of the
classification algorithm, which is single-thre
Dear Marcos,
Thanks for your interest in CloudSuite and welcome to our mailing list.
Regarding the Analytics benchmark, I can see that you made some small
mistakes. In your first example, you didn't execute the whole command:
$MAHOUT_HOME/bin/mahout wikipediaXMLSplitter -d
$MAHOUT_HOME/example
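For reference, a complete invocation of the splitter typically looks like the sketch below; the dump file name, output directory, and chunk size are illustrative assumptions, not values from the original message:

```shell
# Illustrative sketch: split a Wikipedia XML dump into chunks.
# File names and the 64MB chunk size are assumed examples.
$MAHOUT_HOME/bin/mahout wikipediaXMLSplitter \
  -d $MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml \
  -o wikipedia/chunks \
  -c 64
```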
Dear Mandanna,
It seems that you are not using the correct version of Hadoop. Please use
version 0.20.2 as indicated, not 0.22.0.
The version you are currently using doesn't have "ProgramDriver" and several
other classes needed to run this benchmark (due to the changes in the API).
The easiest
Dear Kiran,
It seems that you are not using the correct version of Mahout. Please use
version 0.6 as indicated.
The easiest way to ensure that you use the correct version of the prerequisite
packages is to download the whole benchmark from the CloudSuite website.
I believe this will solve your p
Dear Jinchun,
The warning message that you get is irrelevant. The problem seems to be in
the amount of memory that is given to the map-reduce tasks. You need to
increase the heap size (e.g., with -Xmx2048M) and make sure that you have
enough DRAM for the heap size you indicate. To change the hea
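For Hadoop 0.20, the per-task heap is usually set in mapred-site.xml; a minimal sketch (the property name is from the 0.20-era API, and 2048M is an example value, not a recommendation):

```xml
<!-- mapred-site.xml: JVM options for each map/reduce task -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048M</value>
</property>
```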
From: Jinchun Kim [cien...@gmail.com]
Sent: Friday, March 22, 2013 3:04 PM
To: Djordje Jevdjic
Cc: cloudsuite@listes.epfl.ch
Subject: Re: Question about data analytic
Thanks Djordje :)
I was able to prepare the input data file and now I'm trying to create
category-based splits of
Wikipedia da
heap.
Regards,
Djordje
From: Jinchun Kim [cien...@gmail.com]
Sent: Monday, March 25, 2013 12:56 AM
To: Djordje Jevdjic
Cc: cloudsuite@listes.epfl.ch
Subject: Re: Question about data analytic
Thanks Djordje.
The heap size indicated in mapred-site.xml is set to -Xmx
Dear Tri,
Thanks for pointing this out. Please use the updated instructions from
the web (the scaling factor in the first command is also updated).
Regards,
Djordje
From: Tri M. Nguyen [t...@princeton.edu]
Sent: Monday, July 01, 2013 9:22 PM
To: cloudsu
Hello Binh,
The requests column tells you how many requests were served during the last
statistics interval (1s in your case, because of -T 1). The actual throughput
is the second column, rps (requests per second).
The command you ran is used to estimate the maximum throughput (rps) you can
ac
Dear Mahmood,
CloudSuite 2.0 introduces two new benchmarks: DataCaching and Graph Analytics.
Regarding CloudSuite 1.0 benchmarks, we are currently upgrading the software
packages and updating the benchmarks. Once we are done, we will post the change
log
on the website.
Regards,
Djordje
Dear Yarong,
This error usually implies that you have misconfigured the simulator in the
wiring file.
In other words, your wiring.cpp file is not consistent with the config file.
Most likely it is related to the number of memory controllers (unless you
added your own components that you did
Hi Xiao,
I didn't understand what the problem was, but here are some hints:
1. Your server.txt file must contain the correct information about the
server(s) you want to work with.
2. You run this for a second only (-t 1), and it will exit immediately, as it
did.
3. I see that you are using obj
Hello Kazi,
There is no need to be root when running this benchmark.
Could you let me know the exact Memcached version you are using?
I guess you wanted to say "-S 30" in your command, rather than -S 3.
I see that you are using two servers at the same time. I suggest you create
two
files: s
if the segmentation fault is still there, please run
the client with gdb and send us
the stack trace once it crashes.
Regards,
Djordje
From: Kazi Sudipto Arif [sudipto.a...@gmail.com]
Sent: Tuesday, August 27, 2013 5:54 PM
To: Djordje Jevdjic
Subject: Re
Dear Reza,
To set up the Data Analytics benchmark you will need around 100GB of free disk
space
on one machine. You can remove the temporary files once you are done with the
set-up
phase. To run the benchmark on several machines, you will need less than 10GB
of disk
space per machine. In an
Hello,
It seems that you ran out of memory and your system is doing garbage collection
all the time.
You should try adjusting the number of concurrent map processes and/or the
amount of memory
allocated to each process.
Regards,
Djordje
From: Wu, Jie Ying
Hi Marco,
Thanks for your e-mail.
Indeed, the command line has an error. "-D" is used to configure the memory of
the client during warmup.
On the server side, you should use "-M".
We will fix the documentation. Thanks again for pointing this out.
Regards,
Djordje
To: cloudsuite@listes.epfl.ch
Subject: Re: [datacaching] Bad command line option for memcached
On 11/14/2013 10:44 AM, Djordje Jevdjic wrote:
> Hi Marco,
>
> Thanks for your e-mail.
Hi Djordje,
Thank you for the reply.
> Indeed, the command line has an error. "-D" is used to conf
Hi Marco,
That command is used to quickly estimate the maximum throughput
a server can achieve, just to give you a hint for tuning. No need to
run it for a whole day. It is not important that the number is very
precise. Pick any. You need to play with the load on the client (using "-r"),
incre
Hello Sneha,
Nothing is supposed to be displayed. The server is running from the moment you
enter the command. You may even want to run it in the background, by adding "&"
at the end of the line.
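As a sketch, launching the server in the background could look like the following; the binary name and flag values here are illustrative, so check the benchmark documentation for the exact options:

```shell
# Start memcached in the background; -m sets cache memory in MB and
# -t sets worker threads (both values are examples, not recommendations).
memcached -m 4096 -t 4 &
# The shell prompt returns immediately; the server keeps running.
```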
Regards,
Djordje
From: Sneha Sathyanarayana [sneha.am...@g
Dear all,
We are happy to announce that we will hold an interactive tutorial at
ASPLOS in which you can learn about CloudSuite and the Flexus simulation
infrastructure. More importantly, you will have the opportunity to learn
how to correctly and rigorously evaluate server designs using real-wo
Hello Ahmad,
I took a look at your config files, and they seem correct. The limitation of 14
mappers and 2 reducers is odd; it suggests that you can't utilize more than 32GB
for whatever reason. I've been able to run with more processes on a weaker
machine. Have you ever been able to utilize more th
Dear Suhasini,
Nagle's algorithm is not related to the benchmark. It’s a TCP/IP optimization
that does not work well for this benchmark due to the size of the packets that
are transmitted between the client and the server. So, the default (and in this
case the best) option is to turn it off.
Hello Kun,
Here is the legend for the output:
timediff - the measurement period T (1s in your case)
rps - requests per second during the last T
requests - total number of requests completed within the last T (if T=1s,
equals rps)
gets - number of completed get requests during the last T
Dear all,
Just a reminder that the early registration deadline for our
tutorial is February 10th and we still have a few empty slots for you!
The tutorial will be held in conjunction with ASPLOS'14 and you will have
the opportunity to learn about CloudSuite and the Flexus simulation
infr
Dear Chao,
The input file combines both the object popularity and the object size
distribution. That’s why the sizes are not sorted and some sizes may even
repeat.
Regards,
Djordje
From: Roy Lee [roy.q@gmail.com]
Sent: Thursday, March 06, 2014 8:16
Hello Wei,
A 90th-percentile value of, for example, 2ms means that 90% of the requests
experience a latency of up to 2ms. The same applies to the 95th percentile. The
target latency you want to achieve depends on your frontend application, but
it’s typically 5-10ms.
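As a quick sanity check, one hedged way to read the 90th and 95th percentiles off a file of per-request latencies (one value per line, in ms; nearest-rank method; lat.txt is a hypothetical file name):

```shell
# Nearest-rank percentiles from a list of latencies (ms), one per line.
# lat.txt is a hypothetical input file, not part of the benchmark.
sort -n lat.txt | awk '{ a[NR] = $1 }
  END {
    p90 = a[int(NR * 0.90 + 0.5)]   # 90th percentile (nearest rank)
    p95 = a[int(NR * 0.95 + 0.5)]   # 95th percentile (nearest rank)
    printf "p90=%s p95=%s\n", p90, p95
  }'
```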
Regarding the throughput, you first
Dear Hamza,
Thanks for your interest in CloudSuite.
Unfortunately, the client does not support multi-get requests. We do have plans
to implement that functionality (which is used in Facebook-like settings) at
some point in the future. It will probably happen with the next release of
CloudSuit
Dear Hang Lu,
The hardware requirements for Flexus jobs depend on the type of jobs you are
running and the workload. If you are running timing simulations with sampling,
a single job requires no more than 1GB of RAM and one core or hardware
thread (if hyperthreading should be enabled fo
,
Djordje
From: kishore kumar [kishoregupt...@gmail.com]
Sent: Tuesday, November 04, 2014 7:49 PM
To: Djordje Jevdjic
Cc: cloudsuite@listes.epfl.ch
Subject: Building datacache load tester on SPARC running Solaris
Hi,
First of all, thank you very much for making
From: moslem mosadegh [m.mosadeg...@gmail.com]
Sent: Tuesday, December 02, 2014 2:24 PM
To: Djordje Jevdjic
Subject: CloudStone guide
Hi.
I'm a master's student in computer science working on the CloudStone benchmark.
I searched the net and found your group "parsa