Hi Bikash,
Have you tried adding hdfs:// to your input path? Maybe that helps.
--sebastian
On 03/11/2014 11:22 AM, Bikash Gupta wrote:
Hi,
I am running KMeans on a cluster where I am setting the configuration of
fs.hdfs.impl and fs.file.impl beforehand, as mentioned below
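For readers hitting the same issue: these two properties usually point at the stock Hadoop FileSystem classes. A sketch of the typical core-site.xml entries (Bikash's actual configuration is not shown in the mail):

```xml
<!-- Sketch: typical values for these properties in core-site.xml.
     These are the standard Hadoop implementation classes; adjust to
     your own setup. -->
<property>
  <name>fs.hdfs.impl</name>
  <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
</property>
<property>
  <name>fs.file.impl</name>
  <value>org.apache.hadoop.fs.LocalFileSystem</value>
</property>
```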
Hi,
The problem is not with the input path, it's the way KMeans is being executed. Let
me explain.
I have converted my CSV data to sequence files using map-reduce, hence my data is in HDFS.
After this I have run the Canopy MR job, so that data is also in HDFS.
Now these two things are being pushed into the KMeans MR job.
If you check
Hi,
As you've probably noticed, I've put in a lot of effort over the last
few days to kickstart cleaning up our website. I've thrown out a lot of
stuff and have been startled by the amount of outdated and incorrect
information on our website, as well as links pointing to nowhere.
I think our
Hi Sebastian,
I am afraid I am only familiar with the recommendation part.
In previous posts, I pointed out a couple of errors in this wiki page:
https://cwiki.apache.org/confluence/display/MAHOUT/Quick+tour+of+text+analysis+using+the+Mahout+command+line
If you are planning to keep it in the new
I'll help with the clustering algorithms documentation. Do send me the old
documentation and I will check it and remove errors, or better, let me know
how to proceed.
Pavan
On Mar 12, 2014 12:35 PM, Sebastian Schelter s...@apache.org wrote:
Hi, I just read the whole email now, as I was travelling earlier. I am on
it.
On Mar 12, 2014 12:35 PM, Sebastian Schelter s...@apache.org wrote:
I can confirm what Sebastian said. I'm fairly new to this and I found
myself so desperate at some point that I almost gave up on Mahout due to
the lack of documentation. But my feeling is that it doesn't only concern the
website: the API is too sparsely documented as well. At this point there are no
We don't exactly have that page, but we have pages that touch parts of
it, such as
https://mahout.apache.org/users/basics/creating-vectors-from-text.html
It would be great if you could create a jira ticket which lists the
errors. I'll fix them then.
Best,
Sebastian
On 03/12/2014 08:42 AM,
Hi Pavan,
Awesome that you're willing to help. The documentation consists of the pages
listed under Clustering in the navigation bar on mahout.apache.org.
If you start working on one of the pages listed there (e.g. the k-Means
doc), please create a JIRA ticket in our issue tracker with a title
Hi Manoj,
Awesome that you're willing to help.
I suggest we proceed analogously to the clustering cleanup:
The documentation consists of the pages listed under Classification in the
navigation bar on mahout.apache.org.
If you start working on one of the pages listed there (e.g. the Naive
Bayes
Hi Kevin, go to the Eclipse Marketplace and install m2eclipse. After you do
a mvn install on your Mahout checkout, import the compiled Mahout project.
I'll try to write detailed documentation with screenshots, but for the
moment use the above as a starting point.
On 12 March 2014 15:29, Sebastian Schelter
Hi Kevin,
Thank you for offering to help! Feel free to ask questions here about how to
set up the sources in Eclipse. If you succeed, you could write up what you
did and we could add it to the website, as I'm sure a lot of others
will have the same problem.
It would be great if you could start
Hi All,
I would also like to participate in cleaning up the documentation.
Since I am fairly new to the Mahout infrastructure, it will in turn help
me understand things better. Do we already have a JIRA ticket for
organizing the cleanup of the documentation?
Just want to be sure that I am
Here you can see all issues (resolved and unresolved) for the next release:
https://issues.apache.org/jira/browse/MAHOUT-1413?jql=project%20%3D%20MAHOUT%20AND%20fixVersion%20%3D%201.0%20ORDER%20BY%20priority%20DESC
When you start to work on the cleanup of a page, make sure that there is
no
Thanks, I'll do that partly on my free time since I'm working on other
things at work right now :)
Kévin Moulart
2014-03-12 11:07 GMT+01:00 Sebastian Schelter s...@apache.org:
Thanks Sebastian, that's a great help.
#Pramit
On Wed, Mar 12, 2014 at 3:37 AM, Kevin Moulart kevinmoul...@gmail.com wrote:
Should I raise a JIRA?
On Wed, Mar 12, 2014 at 12:31 PM, Bikash Gupta bikash.gupt...@gmail.com wrote:
I took the tour of the text analysis and pushed through despite the
problems on the page. Committers helped me over the hump where others
might have just given up (to your point).
When I did it, I made shell scripts so that my steps would be repeatable
with an anticipation of updating the page.
I’ll make it work.
Don’t know markdown (I assume some reduced mark-up language), but I’ll
figure it out. I will assume that I can check with my consulting buddy
“Google” and find it. :)
Thank you for your contributions - glad that I can give “something” back.
I’ll start off by sending the doc to
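Markdown is indeed a lightweight markup language, so it is quick to pick up. As a starting point, a minimal illustrative sample of common constructs (not taken from any actual Mahout page):

```markdown
# Page title

Some paragraph text with **bold**, *italics* and `inline code`.

- a bullet item
- another item

[a link](https://mahout.apache.org)
```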
Yes please; if you're seeing confusing behavior when you leave the hdfs
protocol off the URI then it may need some tending.
On Mar 12, 2014, at 7:22 AM, Bikash Gupta bikash.gupt...@gmail.com wrote:
Thanks Scott; please just attach your work to an issue in the Jira system; if
there's not one already you could file a new issue.
On Mar 12, 2014, at 7:44 AM, Scott C. Cote scottcc...@gmail.com wrote:
ok
On 3/12/14, 9:58 AM, Andrew Musselman andrew.mussel...@gmail.com wrote:
Hi,
I tried to fix all the problems I had configuring Eclipse in order to
compile Mahout in it, using maven clean package as the goal.
First I had to make a change in mahout-core, in the class GroupTree.java,
line 171:
stack = new ArrayDeque<GroupTree>();
Then I tried compiling with Eclipse (I
Never mind, I found where the problem lay: I deleted the full content of
.m2 and retried it as a non-root user, and it worked. Trying in Eclipse now,
with tests; I'll let you know if it doesn't work.
Kévin Moulart
2014-03-12 16:45 GMT+01:00 Kevin Moulart kevinmoul...@gmail.com:
MAHOUT-1452 has been raised
On Wed, Mar 12, 2014 at 8:26 PM, Andrew Musselman
andrew.mussel...@gmail.com wrote:
Hi,
Finding the right T1 and T2 for Canopy is a time-consuming task with manual
intervention. I am planning to automate the calculation.
The idea is that I would increment T1 and T2 by x times 3.1 and x times 2.1
respectively, and would collect the approximate T1 and T2 for each of the K clusters.
Not sure if this is
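The sweep described above can be sketched in plain Python. This is a toy illustration only, not Mahout's Canopy implementation; the sample points and number of steps are made-up assumptions, while the step sizes 3.1 and 2.1 are taken from the mail:

```python
# Toy canopy clustering over 1-D points, used to sweep candidate
# (T1, T2) thresholds and see how many canopies each pair produces.
# Illustration only -- this is not Mahout's Canopy implementation.

def canopy(points, t1, t2):
    """Return canopy centers. In real canopy clustering T1 (loose)
    governs membership and T2 (tight) removes points; this toy only
    counts centers, so only t2 affects the result."""
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        centers.append(center)
        # Points within the tight threshold T2 are removed entirely.
        remaining = [p for p in remaining if abs(p - center) > t2]
    return centers

def sweep(points, steps=4):
    """Try increasing (T1, T2) candidates; step sizes from the mail."""
    results = []
    for x in range(1, steps + 1):
        t1, t2 = x * 3.1, x * 2.1
        results.append((t1, t2, len(canopy(points, t1, t2))))
    return results

# Made-up sample data: three obvious groups.
points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0, 20.0, 20.5]
for t1, t2, k in sweep(points):
    print("T1=%.1f T2=%.1f -> %d canopies" % (t1, t2, k))
```

Plotting the canopy count against the candidate thresholds is one way to pick a plateau where the clustering is stable.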
Is there any rationale to what you are proposing?
It's better to go with Streaming KMeans than the combination of Canopy + KMeans
clustering.
Moreover, Canopy clustering (due to a single reducer in the Canopy Generation
phase) is more likely to fail with large datasets, and that's a behavior that's
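For reference, the core idea behind streaming k-means can be sketched in a few lines of plain Python. This is a toy single-pass version, not Mahout's StreamingKMeans; the sample stream and the distance cutoff are made-up assumptions:

```python
# Toy single-pass "streaming" k-means sketch over 1-D points: each
# incoming point either folds into the nearest center (moving it) or,
# if too far away, starts a new center. Illustration of the idea only
# -- this is not Mahout's StreamingKMeans implementation.

def streaming_kmeans(stream, distance_cutoff):
    centers = []  # list of (center, count) pairs
    for p in stream:
        if not centers:
            centers.append((p, 1))
            continue
        # Find the nearest existing center.
        i, (c, n) = min(enumerate(centers), key=lambda e: abs(e[1][0] - p))
        if abs(c - p) <= distance_cutoff:
            # Fold the point into the nearest center (running mean).
            centers[i] = ((c * n + p) / (n + 1), n + 1)
        else:
            centers.append((p, 1))
    return [c for c, _ in centers]

# Made-up stream with three obvious groups.
stream = [1.0, 1.2, 0.8, 10.0, 10.4, 9.6, 20.0]
print(streaming_kmeans(stream, distance_cutoff=3.0))
```

Because each point is seen once and there is no global reduce over all data, this style of algorithm avoids the single-reducer bottleneck mentioned above.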
Not exactly. I was trying to build the logic for this calculation, but before
that I thought I'd take suggestions from everyone.
Anyway, I will give Streaming KMeans a try.
On Thu, Mar 13, 2014 at 3:43 AM, Suneel Marthi suneel_mar...@yahoo.com wrote: