Hello,

It might be useful for you to start with the features provided by some existing datasets, such as these:
http://archive.ics.uci.edu/ml/datasets/Corel+Image+Features
http://archive.ics.uci.edu/ml/datasets/Image+Segmentation

Both datasets store some feature values (color histograms, contrast, textures...). Perhaps you could use this data to test your algorithm. I mean: given an image, return the other images that are most similar to it based on these features.

2010/10/3 gagan chhabra <[email protected]>:
> Thanks a lot. I'll try to do it and will keep posting my status as well as
> queries. :P
>
> I have to figure out what exactly the paper explains. There are some
> points not clear yet.
>
> On Mon, Oct 4, 2010 at 12:14 AM, Ted Dunning <[email protected]> wrote:
>
>> Yes. Extract your features using OpenCV, then use MATLAB or R for your
>> clustering.
>>
>> What you need is good prototyping, not large-scale data mining. Mahout is
>> intended for scale, not necessarily ease of use. You should focus on the
>> problems you have and not add additional ones like learning a large
>> library such as Mahout.
>>
>> On Sun, Oct 3, 2010 at 11:41 AM, gagan chhabra <[email protected]> wrote:
>>
>> > It was proposed that I use MATLAB for this project, but I had no idea,
>> > so I somehow ended up here.
>> > Is it possible to implement this in MATLAB?
>> >
>> > On Sun, Oct 3, 2010 at 11:48 PM, Ted Dunning <[email protected]> wrote:
>> >
>> > > This paper had some interesting references. The problem they worked on
>> > > was different from yours, but if you know something about the training
>> > > images, this might work out. The something might be the original
>> > > web-site nearby text or almost anything.
>> > >
>> > > http://www.public.asu.edu/~huanliu/.../SBP09_3-31(Baoxin%20Li%20-4).pdf
>> > >
>> > > This paper describes the use of Gabor transforms and histograms for
>> > > image clustering:
>> > >
>> > > http://www-nlpir.nist.gov/projects/tvpubs/tv6.papers/eurecom.pdf
>> > >
>> > > HSV histogram clustering might be a reasonable scale effort for a
>> > > student project.
>> > >
>> > > Another approach is to try a latent factor method to characterize
>> > > images. This paper describes an image completion task on a handwritten
>> > > digit dataset. I am pretty sure that clustering on these latent
>> > > features would give very nice clustering because they inherently have
>> > > a Euclidean metric imposed on them.
>> > >
>> > > http://arxiv.org/abs/1006.2156
>> > >
>> > > The recommendation that you use OpenCV for image extraction is a very
>> > > good one. You might want to use Mahout for clustering, but I doubt you
>> > > will have enough images to make that worthwhile. Just extracting
>> > > useful features will take a long time.
>> > >
>> > > On Sun, Oct 3, 2010 at 10:33 AM, gagan chhabra <[email protected]> wrote:
>> > >
>> > > > Hello Steven Bourke,
>> > > >
>> > > > The data is actually not text. The query is an image and the
>> > > > database is again of images.
>> > > >
>> > > > I wanted to know how one can declare one image similar to another,
>> > > > in programming terms. I mean there has to be some parameter of
>> > > > analysis or algorithm which can solve this problem.
>> > > >
>> > > > On Sun, Oct 3, 2010 at 10:44 PM, Steven Bourke <[email protected]> wrote:
>> > > >
>> > > > > Where is the semantic data coming from? I think something like
>> > > > > Lucene would be more relevant if you are searching text based on
>> > > > > available meta data.
>> > > > >
>> > > > > On Sun, Oct 3, 2010 at 6:54 PM, Sean Owen <[email protected]> wrote:
>> > > > >
>> > > > > > You probably want to look at Shannon's spectral clustering code?
>> > > > > > That's the closest thing I can think of in Mahout. It doesn't
>> > > > > > have much of anything for image processing.
>> > > > > >
>> > > > > > On Sun, Oct 3, 2010 at 5:02 PM, gagan chhabra <[email protected]> wrote:
>> > > > > >
>> > > > > > > Hello all,
>> > > > > > >
>> > > > > > > I am an Engineering candidate and took a project which is
>> > > > > > > based on Machine Learning. The idea is Query-by-Image; it is a
>> > > > > > > research paper by Googlers. I am not getting any point to
>> > > > > > > start off.
>> > > > > > >
>> > > > > > > I don't know if Mahout is of any use to me, but since it is
>> > > > > > > meant for Machine Learning I joined to know more about it.
>> > > > > > >
>> > > > > > > My application will go like:
>> > > > > > > > User enters a query (which is an image).
>> > > > > > >
>> > > > > > > > Then the application searches for other images in the
>> > > > > > > > database with the same semantics.
>> > > > > > > For example, if the user enters an image of a dog, the app
>> > > > > > > will retrieve other images of dogs, or if the user enters an
>> > > > > > > image of a snowy mountain, it retrieves similar images.
>> > > > > > >
>> > > > > > > So I don't get how to compare images. What metric to use to
>> > > > > > > declare any image similar to the query image.
>> > > > > > >
>> > > > > > > Please suggest something... any help will make a huge
>> > > > > > > difference.
>> > > > > > >
>> > > > > > > --
>> > > > > > > gagan
>> > > > > > >
>> > > >
>> > > > --
>> > > > gagan
>> > > >
>> >
>> > --
>> > gagan
>> >
>
> --
> gagan
>
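
To make the "given an image, return the most similar ones based on these features" idea from the top of this message a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration under assumptions: the function names (hsv_histogram, euclidean, most_similar), the 8x4x4 bin layout and the choice of Euclidean distance are hypothetical and not taken from the paper, from the UCI datasets or from Mahout. Images are represented as plain lists of (r, g, b) tuples; in practice you would read the pixels with something like OpenCV instead.

```python
import colorsys
import math


def hsv_histogram(pixels, bins=(8, 4, 4)):
    """Return a normalized HSV histogram for an iterable of (r, g, b)
    pixels with channel values in 0..255. The bin counts are an
    arbitrary illustrative choice."""
    hb, sb, vb = bins
    hist = [0.0] * (hb * sb * vb)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # min() keeps a channel value of exactly 1.0 inside the last bin.
        hi = min(int(h * hb), hb - 1)
        si = min(int(s * sb), sb - 1)
        vi = min(int(v * vb), vb - 1)
        hist[(hi * sb + si) * vb + vi] += 1.0
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    # Normalize so images of different sizes are comparable.
    return [x / count for x in hist]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query_hist, database, k=3):
    """Rank the images in `database` (a dict of name -> histogram) by
    distance to the query histogram and return the k nearest as
    (name, distance) pairs, nearest first."""
    ranked = sorted(
        ((name, euclidean(query_hist, hist)) for name, hist in database.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy "images": flat color patches instead of real photos.
    red = [(250, 10, 10)] * 100
    dark_red = [(240, 20, 15)] * 100
    blue = [(10, 10, 250)] * 100

    database = {
        "dark_red": hsv_histogram(dark_red),
        "blue": hsv_histogram(blue),
    }
    for name, dist in most_similar(hsv_histogram(red), database, k=2):
        print(name, round(dist, 3))
```

A color histogram only captures "same colors", not "same semantics", so this would find other snowy scenes more easily than other dogs. It is still a reasonable first baseline to compare against, and the same ranking function works unchanged if you swap in the texture or histogram features from the datasets above.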
