Hi,
I managed to fix it even if I'm still not entirely sure what happened.
The fix is to make a new Text object every time a Text is required as input
(Text does not implement Cloneable). I guess it
So instead of:
Text candidate = e.getTargetVertexId();
...
vertex.setValue(candidate))
The working version is:
Text candidate = e.getTargetVertexId();
...
vertex.setValue(new Text(candidate.toString()))
@Override
public void compute(Vertex<Text, Text, NullWritable> vertex,
Iterable<Text> inmessage) throws IOException {
boolean changed = false;
// first superstep && setup
if (getSuperstep() == 0) {
//initialize value
vertex.setValue(new Text(vertex.getId().toString()));
//cheating by checking the neighbors ID's (cuts down 1
iteration)
for (Edge<Text, NullWritable> e : vertex.getEdges()) {
Text candidate = e.getTargetVertexId();
if (candidate.compareTo(vertex.getValue()) < 0) {
changed = true;
vertex.setValue(new Text(candidate.toString()));
}
}
}
// other superstep
else {
// read all messages and compare with own state
for (Text message : inmessage) {
if (message.compareTo(vertex.getValue()) < 0) {
changed = true;
vertex.setValue(new Text(message.toString()));
}
}
}
// if state has changed send a message to all neighbors
if (changed && getSuperstep() < limiter) {
sendMessageToAllEdges(vertex, new
Text(vertex.getValue().toString()));
}
vertex.voteToHalt();
}
On Mon, Mar 3, 2014 at 2:19 PM, Sebastian Schelter <[email protected]> wrote:
> Hi Martin,
>
> I'm not sure wether we require InputFormats to be threadsafe. Can someone
> answer that question?
>
> Maybe thats the reason you see this behavior.
>
> --sebastian
>
>
>
> On 03/03/2014 10:05 AM, Martin Neumann wrote:
>
>> I checked the input just creating the graph and comparing it. While I cant
>> say the graph is correct (its to big) its at least consistent.
>>
>> So the only things where the different output can come from is the
>> connected component part (see code further down). I'm completely stomped,
>> the code is basically the example code Giraph ships with all I changed was
>> replacing the IntWriteable Id's with Text.
>>
>> Anyone has any Idea what the problem could be, I'm running out of Idea's.
>>
>> cheers Martin
>>
>>
>>
>> @Override
>> public void compute(Vertex<Text, Text, NullWritable> vertex,
>> Iterable<Text> inmessage) throws IOException {
>> boolean changed = false;
>>
>> // first superstep && setup
>> if (getSuperstep() == 0) {
>> //initialize value
>> vertex.setValue(vertex.getId()
>> );
>> //cheating by checking the neighbors ID's (cuts down 1
>> iteration)
>> for (Edge<Text, NullWritable> e : vertex.getEdges()) {
>> Text candidate = e.getTargetVertexId();
>> if (candidate.compareTo(vertex.getValue()) < 0) {
>> changed = true;
>> vertex.setValue(candidate);
>> }
>> }
>> }
>>
>> // other superstep
>> else {
>> // read all messages and compare with own state
>> for (Text message : inmessage) {
>> if (message.compareTo(vertex.getValue()) < 0) {
>> changed = true;
>> vertex.setValue(message);
>> }
>> }
>> }
>>
>> // if state has changed send a message to all neighbors
>> if (changed && getSuperstep() < limiter) {
>> sendMessageToAllEdges(vertex, vertex.getValue());
>> }
>> vertex.voteToHalt();
>> }
>>
>>
>>
>> On Mon, Mar 3, 2014 at 9:56 AM, Martin Neumann <[email protected]>
>> wrote:
>>
>> Hey,
>>>
>>> sorry, yes I intended to send it to the mailing-list, thanks for the hint
>>> (I repost it to the list). Guess I should stop working in the middle of
>>> the
>>> night. :-)
>>>
>>> I tested the input format and its output is consistent, I always get the
>>> same graph when running it. So the bug has to be in the connected
>>> component
>>> part since that one gives me different outputs for the same input graph.
>>>
>>> cheers Martin
>>>
>>>
>>> On Mon, Mar 3, 2014 at 8:06 AM, Sebastian Schelter <[email protected]>
>>> wrote:
>>>
>>> Martin,
>>>>
>>>> can you write a MapReduce job that creates your graph and run it with a
>>>> simpler inputformat?
>>>>
>>>> I really suspect that the bug lies somewhere in your input format.
>>>>
>>>> --sebastian
>>>>
>>>>
>>>> On 03/02/2014 09:48 PM, Martin Neumann wrote:
>>>>
>>>> I checked the input just creating the graph and comparing it. While I
>>>>> cant
>>>>> say the graph is correct (for its to big) its at least consistent.
>>>>>
>>>>> So the only things where the different output can come from is the
>>>>> connected component part (see code in the first mail). I'm completely
>>>>> stomped, the code is basically the example code Giraph ships with all I
>>>>> changed was replacing the IntWriteable Id's with Text.
>>>>>
>>>>> Anyone has any Idea what the problem could be, I'm running out of
>>>>> Idea's.
>>>>>
>>>>> cheers Martin
>>>>>
>>>>>
>>>>> On Thu, Feb 27, 2014 at 4:59 PM, Sebastian Schelter <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Martin
>>>>>
>>>>>>
>>>>>> I don't think that there are problems with comparing and sorting Text
>>>>>> writables as Hadoop is basically a big external sorting system.
>>>>>>
>>>>>> I'm not sure I understand your edge input reader, it looks very
>>>>>> complex,
>>>>>> maybe there's a bug somewhere. You could try to preprocess your data
>>>>>> using
>>>>>> Hadoop so that you can use a simple VertexInputFormat and see if your
>>>>>> problems still occur.
>>>>>>
>>>>>> --sebastian
>>>>>>
>>>>>>
>>>>>> On 02/27/2014 04:41 PM, Martin Neumann wrote:
>>>>>>
>>>>>> Hm
>>>>>>
>>>>>>>
>>>>>>> I ran the job 5 times and made a diff between the outputs and they
>>>>>>> are
>>>>>>> not
>>>>>>> the same. I cant find anything in the code that could lead to this
>>>>>>> behaviour.
>>>>>>>
>>>>>>> The only idea where to look a the moment would be the identifier. Has
>>>>>>> anyone experience with String identifier?
>>>>>>> Is a possible that there are problems with comparing and sorting
>>>>>>> TextWritables?
>>>>>>>
>>>>>>> cheers Martin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>