[jira] [Created] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
Case insensitive file/directory name matching will produce errors on M/R jar unpack. - Key: GIRAPH-159 URL: https://issues.apache.org/jira/browse/GIRAPH-159 Project: Giraph Issue Type: Bug Components: build Affects Versions: 0.2.0 Environment: OSX 10.6.8 Reporter: Brian Femiano This only seems to affect platforms where there can be a file/directory naming conflicts from case insensitive matches. I was able to reproduce running the pseudo-distributed unit tests within OSX. This has affected other projects: https://issues.apache.org/jira/browse/MAHOUT-780 I've been able to reproduce this on my local OSX install with the following error: https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8 Since LICENSE.txt contains the same content as the file LICENSE, I propose we exclude any LICENSE matches found in the unpacked dependency jars when the maven assembly phase hits 'jar-with-dependencies'. I have a patch which moves the 'jar-with-dependencies' descriptor to an external compile.xml file which has the proper excludes. This might also come in handy down the road should any additional tweaks be needed to the compile phase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GIRAPH-157) Vertex to perform graph coloring on simple, connected, undirected graphs and related test.
[ https://issues.apache.org/jira/browse/GIRAPH-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232835#comment-13232835 ] Eli Reisman commented on GIRAPH-157: Hi Sebastian, Yes I make no claims its in polynomial time or a perfect algorithm. It does seem to pass all the test graphs i put it too, but they are small. I would like to test it on a larger graph but I need to generate or find one that I have a worked out coloring for. Even in the small tests, I had to work out colorings to make sure the one it chose was a real choice and I had to confirm on wikipedia that the one it found was also a minimal coloring for that graph. The algorithm just says, everyone starts with no color. Hardcoded (for now) SOURCE_VERTEX_ID vertex is lucky guy, chooses FIRST_COLOR and messages others. From here on we do supersteps until everyone votes to halt. On each superstep, messages are passed to a vertex if one of its neighbors changes color. Each vertex is keeping track of a reference count of how many neighbors occupy which colors. Any round where messages come in, we end up checking to see if we need to change color. Any round where a message contains a color conflict with us, we oblige by choosing a new color. Choosing a new color always consists of looking for the lowest unoccupied color the neighbors aren't currently occupying. If we end up choosing a new color and it doesn't end up beign the color we already are, then we always message all neighbors regarding our old color (to remove a ref count) and our new one (to add a ref count and recalc their color if need be). The trick is when you get conflicting color messages directly with your own color. As far as I could figure, this should only happen when you and one or more of your neighbors chose the same color at the same time (the last superstep) and in this case, I let the one with the lowest vertexId break the tie and keep the color. This is arbitrary, but seems to work as these ties must be broken and since everyone tries to choose a new color targeting the lowest color available according to their own reference counts, these conflicts tend to settle out correctly in the end (so far!). One possible optimization (or not?) involves letting the conflict count (a count of how many of the current neighbors I am in simultaneous color conflict with who have a higher (weaker) vertexID than the current vertex) help me guess at my next color choice. this should reduce the number of supersteps required to resolve a worst-case N-way conflict (which i believe is currently about N supersteps). I am speaking of course of an isolated conflict. I do have the sneaking suspicion that given a graph that is still simple, undirected, and connected (thats all i made promises for!) of sufficient size and complexity, it might be possible to tangle this thing up in some feedback loop somehow? Further, I would suspect that this ends up being some sort of heuristic close-guess at a minimal color in some complex situations too. So, I'm sure its not super fast and I'm not at all sure it works at scale, but it does seem fairly simple and it seems to work so far. Does someone know some slicker algorithms that we can convert to a think like a vertex paradigm? To be clear, this algorithm only colors vertices too -- I don't know how to think like an edge yet at all! Thanks, I welcome the feedback Eli Vertex to perform graph coloring on simple, connected, undirected graphs and related test. -- Key: GIRAPH-157 URL: https://issues.apache.org/jira/browse/GIRAPH-157 Project: Giraph Issue Type: Test Components: examples, test Affects Versions: 0.2.0 Reporter: Eli Reisman Assignee: Eli Reisman Priority: Trivial Labels: newbie Attachments: GIRAPH-157.patch Hi. I am attempting to learn the Hadoop and Giraph codebases and wanted to write a simple client application for Giraph to help me learn the ins and outs of it. This is a simple unit test and vertex modeled after the ConnectedComponentsVertex and related test. The vertex test runs whenever you run the mvn test or mvn verify suite of tests. When finished processing, each vertex will have an integer value that is its color. This is a pretty simple implementation, and although I have tested it on a number of small graphs of varied trickiness and it seems to rapidly arrive at a minimal coloring, its hard (for me at least) to guess which possible coloring it will arrive at and I have no idea how it will do on really big graphs yet without finding some more pre-colored larger test graphs to try it on. Ideas anyone? Anyway, it was fun to put this together, and I'd
[jira] [Updated] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-159: - Attachment: GIRAPH-159.patch Case insensitive file/directory name matching will produce errors on M/R jar unpack. - Key: GIRAPH-159 URL: https://issues.apache.org/jira/browse/GIRAPH-159 Project: Giraph Issue Type: Bug Components: build Affects Versions: 0.2.0 Environment: OSX 10.6.8 Reporter: Brian Femiano Priority: Minor Attachments: GIRAPH-159.patch This only seems to affect platforms where there can be a file/directory naming conflicts from case insensitive matches. I was able to reproduce running the pseudo-distributed unit tests within OSX. This has affected other projects: https://issues.apache.org/jira/browse/MAHOUT-780 I've been able to reproduce this on my local OSX install with the following error: https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8 Since LICENSE.txt contains the same content as the file LICENSE, I propose we exclude any LICENSE matches found in the unpacked dependency jars when the maven assembly phase hits 'jar-with-dependencies'. I have a patch which moves the 'jar-with-dependencies' descriptor to an external compile.xml file which has the proper excludes. This might also come in handy down the road should any additional tweaks be needed to the compile phase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-159: - Attachment: (was: GIRAPH-159.patch) Case insensitive file/directory name matching will produce errors on M/R jar unpack. - Key: GIRAPH-159 URL: https://issues.apache.org/jira/browse/GIRAPH-159 Project: Giraph Issue Type: Bug Components: build Affects Versions: 0.2.0 Environment: OSX 10.6.8 Reporter: Brian Femiano Priority: Minor This only seems to affect platforms where there can be a file/directory naming conflicts from case insensitive matches. I was able to reproduce running the pseudo-distributed unit tests within OSX. This has affected other projects: https://issues.apache.org/jira/browse/MAHOUT-780 I've been able to reproduce this on my local OSX install with the following error: https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8 Since LICENSE.txt contains the same content as the file LICENSE, I propose we exclude any LICENSE matches found in the unpacked dependency jars when the maven assembly phase hits 'jar-with-dependencies'. I have a patch which moves the 'jar-with-dependencies' descriptor to an external compile.xml file which has the proper excludes. This might also come in handy down the road should any additional tweaks be needed to the compile phase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-159: - Attachment: compile.xml GIRAPH-159.patch Case insensitive file/directory name matching will produce errors on M/R jar unpack. - Key: GIRAPH-159 URL: https://issues.apache.org/jira/browse/GIRAPH-159 Project: Giraph Issue Type: Bug Components: build Affects Versions: 0.2.0 Environment: OSX 10.6.8 Reporter: Brian Femiano Priority: Minor Attachments: GIRAPH-159.patch, compile.xml This only seems to affect platforms where there can be a file/directory naming conflicts from case insensitive matches. I was able to reproduce running the pseudo-distributed unit tests within OSX. This has affected other projects: https://issues.apache.org/jira/browse/MAHOUT-780 I've been able to reproduce this on my local OSX install with the following error: https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8 Since LICENSE.txt contains the same content as the file LICENSE, I propose we exclude any LICENSE matches found in the unpacked dependency jars when the maven assembly phase hits 'jar-with-dependencies'. I have a patch which moves the 'jar-with-dependencies' descriptor to an external compile.xml file which has the proper excludes. This might also come in handy down the road should any additional tweaks be needed to the compile phase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira