Nice!
Avery
On 5/12/12 2:58 AM, Sebastian Schelter wrote:
Hi,
I will give a talk titled Large Scale Graph Processing with Apache
Giraph in Berlin on May 29th. Details are available at:
I think you're right that the javadoc isn't specific enough.
* Use a registered aggregator in current superstep.
* Even when the same aggregator should be used in the next
* superstep, useAggregator needs to be called at the beginning
* of that superstep in preSuperstep().
*
*
Awesome! Congrats Eugene, we're excited to have you taking on a big role.
Avery
On 5/1/12 5:18 PM, Hyunsik Choi wrote:
Congrats and welcome Eugene!
I'm looking forward to your contribution.
--
Hyunsik Choi
On Wed, May 2, 2012 at 5:39 AM, Jakob Homan jgho...@gmail.com
On 11 Apr 2012, at 18:37, Avery Ching wrote:
There is no preferred way to represent labeled graphs. A close example to
your adjacency list idea is LongDoubleDoubleAdjacencyListVertexInputFormat.
Exactly. Giraph supports labeled graphs very easily.
My reply is a little bit late, so you probably
Very nice! Will these be similar to the 'Parallel Processing beyond
MapReduce' workshop after Berlin Buzzwords? It would be good to add at
least one of them to the page.
Avery
On 4/19/12 12:31 PM, Sebastian Schelter wrote:
Here are the slides of my talk Introducing Apache Giraph for Large
have no
job conf dated the 13th. Does Hadoop not use the local time
to name the files?
Thanks,
Étienne
On 16 April 2012 19:45, Avery Ching ach...@apache.org wrote:
Etienne, the task tracker logs are not what I meant, sorry for the
confusion
Hi Paulo,
Can you try something for me? I was able to get the PageRankBenchmark
to work running in local mode just fine on my side.
I think we should have some kind of a helper script (similar to
bin/giraph) to run simple tests in LocalJobRunner.
I believe that for LocalJobRunner to
Hi Etienne,
Thanks for your questions. Giraph uses map tasks to run its master and
workers. Can you provide the task output logs? It looks like your
workers failed to report status for some reason and we need to find out
why. The datanode logs can't help us here.
Avery
On 4/13/12 3:35
GiraphJob is not using TurtleVertexInputFormat.class and
TurtleVertexOutputFormat.class, but I don't see what I am doing
wrong. :-/
Thanks,
Paolo
[1]
https://github.com/castagna/jena-grande/blob/master/src/test/resources/log4j.properties
Avery Ching wrote:
I think the issue might
There is no preferred way to represent labeled graphs. A close
example to your adjacency list idea is
LongDoubleDoubleAdjacencyListVertexInputFormat.
Hope that helps,
Avery
On 4/11/12 10:00 AM, Paolo Castagna wrote:
Hi,
I am not sure what's the best way to represent labeled graphs in
I think the issue might be that Hadoop only logs INFO and above messages
by default. Can you retry with INFO level logging?
Avery
On 4/10/12 12:17 PM, Paolo Castagna wrote:
Hi,
I am still learning Giraph, so, please, be patient with me and forgive my
trivial questions.
As a simple initial
That is great news Sebastian! Congrats, I wish I was in Berlin to attend.
Avery
On 4/4/12 2:12 AM, Sebastian Schelter wrote:
Hi everybody,
I'd like to announce the 'Parallel Processing beyond MapReduce' workshop
which will take place directly after the Berlin Buzzwords conference (
If you're using one master and one slave, you need to do -w 1. Did you
see any error about the RPC server starting up?
Avery
On 4/3/12 1:37 PM, Robert Davis wrote:
Hello,
I was trying to run Giraph on two machines (one master and one slave)
but kept getting exceptions when establishing RPC
As Benjamin mentioned, it depends on the number of map tasks your hadoop
install is running with. You could set it proportionally to the number
of cores it has if you like, but try using Benjamin's suggestions to get
it working with more map tasks. I believe if you don't set the default,
the
Benjamin, my guess is that your jar might not have all the ZooKeeper
dependencies. Can you look at the log for the process that was supposed
to start ZooKeeper? I'm thinking it didn't start...
Avery
On 3/20/12 1:14 PM, Benjamin Heitmann wrote:
Hello,
after getting my feet wet with the
You can use it for performance testing, although it is not a great
simulation of real graphs. Real graphs tend to be more power law
distributed (see https://issues.apache.org/jira/browse/GIRAPH-26).
Hope that helps,
Avery
On 3/17/12 8:13 PM, Fleischman, Stephen (ISS SCI - Plano TX) wrote:
basically an abstract class and subclasses can
override methods to provide default values for vertices and edges
(otherwise values are initialized to null), just like Avery described
below. If you think it's useful I can contribute this.
On Wed, Mar 14, 2012 at 7:39 AM, Avery Ching ach...@apache.org
Hi Giraphers,
We have a submission for the 2012 Hadoop summit and part of deciding
whether it gets accepted is based on community voting. It would be
great to get more folks interested and involved in what is going on with
Giraph so please vote! Here's the link:
.
We'd love to have your contributions, it's a great fit. =)
Looking forward to your response!
Thanks!
On Mon, Mar 12, 2012 at 9:09 PM, Avery Ching ach...@apache.org wrote:
Benjamin,
By the way, you're not the first to ask for a feature of this
kind
Sorry for the delayed response. Responses inline.
Avery
On 3/8/12 7:14 AM, Benjamin Heitmann wrote:
Hello again,
I am wondering if it would be possible to parse RDF input files from a
TextInputFormat class.
The most suitable text format for RDF is called NTriples, and it has this
very
Benjamin,
By the way, you're not the first to ask for a feature of this kind.
Perhaps we should consider an alternative format for loading input
vertex data that is based on the edges or data of the vertices rather
than totally vertex-centric. We could load an edge, or a vertex value
and
Inline responses. We look forward to hearing about your work Benjamin!
On 3/5/12 9:12 AM, Benjamin Heitmann wrote:
On 2 Mar 2012, at 23:15, Avery Ching wrote:
If I'm reading this right, you're using a public abstract class for the vertex.
The vertex class must be instantiable and cannot
Hi Abhishek,
Nice to meet you. Can you try it with fewer workers? For instance -w 1
or -w 2? I think the likely issue is that you need to have as many map
slots as the number of workers + at least one master. If you don't have
enough slots, the job will fail. Also, you might want to dial
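
The slot rule in the reply above (map slots >= number of workers + at least one master) can be checked with a few lines of arithmetic before submitting a job. This is only a sketch of that rule of thumb as stated in the message: the class and method names are hypothetical, nothing here calls Giraph or Hadoop, and the number of available map slots is something you would have to look up for your own cluster.

```java
// Hypothetical helper; not part of Giraph. It encodes only the rule of
// thumb from the message: one map slot per worker plus one for the master.
class WorkerSlotCheck {

    /** Map slots needed for a job started with -w <workers>. */
    static int requiredMapSlots(int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("need at least one worker");
        }
        return workers + 1; // workers + at least one master
    }

    /** True if a job with this many workers fits in the available slots. */
    static boolean fits(int workers, int availableMapSlots) {
        return requiredMapSlots(workers) <= availableMapSlots;
    }

    public static void main(String[] args) {
        int availableMapSlots = 4; // assumed value, for illustration only
        for (int workers = 1; workers <= 4; workers++) {
            System.out.println("-w " + workers + " fits: "
                + fits(workers, availableMapSlots));
        }
    }
}
```

Under this rule, a cluster with 4 map slots can run at most -w 3, which is why retrying with -w 1 or -w 2 is a reasonable first test.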
Sorry about the old documentation. I just updated the shortest paths
example. Before major changes to the graph distribution, the vertex ids
were required to be sorted. That is no longer the case. You can input
vertices in any order. The only restriction is that the vertex ids must
be
IntIntNullIntTextInputFormat in the examples package (extending
TextVertexInputFormat as David suggests) is very similar to what you
need I think, although the types might be different for your
application. You can start with that perhaps.
Avery
On 2/18/12 7:48 AM, David Garcia wrote:
The
Yes, there is a way to disable the counters at runtime.
See GiraphJob:
/** Use superstep counters? (boolean) */
public static final String USE_SUPERSTEP_COUNTERS =
"giraph.useSuperstepCounters";
and set to false.
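
A minimal sketch of what "set to false" amounts to, assuming the job configuration behaves like a string key/value store. To keep the example runnable on its own, a plain java.util.Properties stands in for Hadoop's Configuration; the class name, the helper methods, and the default of true for an unset key are assumptions for illustration, and only the key string comes from the message above.

```java
import java.util.Properties;

// Stand-alone illustration; Properties is a stand-in for the real job
// configuration, and the helper names below are hypothetical.
class SuperstepCountersSketch {

    /** Key quoted from GiraphJob in the message above. */
    static final String USE_SUPERSTEP_COUNTERS = "giraph.useSuperstepCounters";

    /** Assumed default when the key is unset; not stated in the message. */
    static final boolean ASSUMED_DEFAULT = true;

    /** Turns the per-superstep counters off in the given configuration. */
    static Properties disableSuperstepCounters(Properties conf) {
        conf.setProperty(USE_SUPERSTEP_COUNTERS, Boolean.toString(false));
        return conf;
    }

    /** Reads the flag back, falling back to the assumed default. */
    static boolean useSuperstepCounters(Properties conf) {
        String value = conf.getProperty(USE_SUPERSTEP_COUNTERS);
        return value == null ? ASSUMED_DEFAULT : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("before: " + useSuperstepCounters(conf));
        disableSuperstepCounters(conf);
        System.out.println("after:  " + useSuperstepCounters(conf));
    }
}
```

With a real job, the same key would presumably be set to false on the job's Hadoop configuration before submission, for example through the configuration's boolean setter.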
Avery
On 2/16/12 1:41 PM, David Garcia wrote:
I have a job that could
Hi Jeffrey,
Best attempt at answers inline.
On 2/16/12 6:12 PM, Jeffrey Yunes wrote:
Hi Giraph community,
I think I followed all of the directions (for Giraph on a pseudo-cluster),
and it looks like
mvn clean test -Dprop.mapred.job.tracker=localhost:9001
runs fine. However, I'm new to
AFAIK we don't have any SOP for opening issues. Maybe I'll take a crack
at this one tonight if I find some time, unless you were planning to
work on it David.
Avery
On 2/8/12 5:46 PM, David Garcia wrote:
I opened up
* GIRAPH-144 https://issues.apache.org/jira/browse/GIRAPH-144
I apologize
If you're using GiraphJob, the mapper class should be set for you.
That's weird.
Avery
On 2/7/12 5:58 PM, David Garcia wrote:
That's interesting. Yes, I don't need native libraries. The problem I'm
having is that after I run job.waitForCompletion(..),
The job runs a mapper that is
Thanks for the comments David. The behavior of what happens is
completely defined by the chosen VertexResolver, see
(GiraphJob#setVertexResolverClass). Developers can implement any
behavior they want. I believe the only reason to bypass was as a
performance optimization.
Avery
On 2/3/12
We can diverge from the Pregel API as long as we have a good reason for
it. I do agree that while we can support multi-graphs with a
user-chosen edge type, some built-in support that makes programming
easier sounds like a good goal. Andre or Claudio, feel free to open a
JIRA to discuss this.
To address the issues of binaries, could we release multiple binaries of
Giraph that coincide with the different versions of Hadoop?
On 1/31/12 7:44 PM, David Garcia wrote:
I think these concerns preclude the entire idea of a release. A release
should be something that users can use as a
Glad to hear you fixed your problem. It would be great if you could
describe any improvements that would have helped you find the issues
earlier. Maybe we (or you) could add them =).
Avery
On 1/23/12 8:31 AM, André Kelpe wrote:
Hi all,
thanks for all the answers so far, it turns out that
/ find optimal configurations
for various regimes of problems, and would like to see Giraph succeed, so
let me know if there's any open issues which I might be able to dig into
(I'm on the dev mailing list as well, though haven't posted there).
Thanks,
Jon
On Dec 11, 2011, at 1:02 PM, Avery Ching
Would be great if you can document what you did. =)
Thanks,
Avery
On 11/8/11 3:13 PM, Claudio Martella wrote:
Sorry guys, my bad.
Was calling job.waitForCompletion() directly. I've been coding
standard MapReduce the whole weekend...
Anyway I got a solution for clean packaging of your own
I use Eclipse and it's okay for running unit tests, but I need to set the
VM args in the JUnit run configuration for each specific test to
-Dprop.jarLocation=target/giraph-0.70-jar-with-dependencies.jar. I
assume you need to do the same for IntelliJ.
This is done in pom.xml when doing 'mvn
Hi Gianmarco,
Welcome to Giraph! We definitely look forward to having your
input/contributions. Answers inline.
On 10/26/11 8:07 AM, Gianmarco De Francisci Morales wrote:
Hi,
First of all let me introduce myself, my name is Gianmarco and I am a
researcher.
Second, let me congratulate
The GraphLab model is more asynchronous than BSP. They allow you to update
your neighbors rather than the BSP model of messaging per superstep. Rather
than one massive barrier in BSP, they implement this with vertex locking.
They also allow a vertex to modify the state of its neighbors. We could