Hi,

This may be a better question for the Cloudera Search mailing list.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Mon, Jan 6, 2014 at 11:06 AM, Huynh, Chi-Hao <hu...@initions.com> wrote:

> Dear Solr users,
>
> I would appreciate it if someone could help me out here. My goal is to
> index a csv-file.
>
> First of all, I am using the CDH 5 beta distribution of Hadoop, which
> includes Solr 4.4.0, on a single node. I am following the Hue tutorial to
> index and search the data from the Yelp dataset challenge:
> http://gethue.tumblr.com/post/65969470780/hadoop-tutorials-season-ii-7-how-to-index-and-search
>
> Following the tutorial, I have uploaded the config files, including the
> prepared schema.xml, to ZooKeeper via the solrctl command:
> >solrctl instancedir --create reviews [path to conf]
>
> After this, I have created the collection via:
> >solrctl collection --create reviews -s 1
>
> This works fine, as I can see the collection created in the Solr Admin Web
> UI and the instancedir in the ZooKeeper shell.
>
> Then, using the MapReduceIndexerTool and the provided morphline file, the
> index is created and uploaded to Solr. According to the command output, the
> index was created successfully:
>
> 1481 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Indexing 1 files using 1 real mappers into 1 reducers
> 52716 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Done. Indexing 1 files using 1 real mappers into 1 reducers took 51.233 secs
> 52774 [main] INFO org.apache.solr.hadoop.GoLive - Live merging of output shards into Solr cluster...
> 52829 [pool-4-thread-1] INFO org.apache.solr.hadoop.GoLive - Live merge hdfs://svr-hdp01:8020/tmp/load/results/part-00000 into http://SVR-HDP01:8983/solr
> 53017 [pool-4-thread-1] INFO org.apache.solr.client.solrj.impl.HttpClientUtil - Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> 53495 [main] INFO org.apache.solr.hadoop.GoLive - Committing live merge...
> 53496 [main] INFO org.apache.solr.client.solrj.impl.HttpClientUtil - Creating new http client, config:
> 53512 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Waiting for client to connect to ZooKeeper
> 53513 [main-EventThread] INFO org.apache.solr.common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager@19014023 name:ZooKeeperConnection Watcher:SVR-HDP01:2181/solr got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
> 53513 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Client is connected to ZooKeeper
> 53514 [main] INFO org.apache.solr.common.cloud.ZkStateReader - Updating cluster state from ZooKeeper...
> 53652 [main] INFO org.apache.solr.hadoop.GoLive - Done committing live merge
> 53652 [main] INFO org.apache.solr.hadoop.GoLive - Live merging of index shards into Solr cluster took 0.878 secs
> 53652 [main] INFO org.apache.solr.hadoop.GoLive - Live merging completed successfully
> 53652 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Succeeded with job: jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1388405934175_0013
> 53653 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Success. Done. Program took 53.719 secs. Goodbye.
>
> Now, when I go to the web UI and select the created core, I find the core
> to be empty: it shows 0 docs, and querying it returns no results.
> My question is whether I have to upload the csv-file manually to somewhere
> on the Solr server. It seems the csv-file was parsed and indexed
> successfully, but the indexed data is missing.
>
> I hope the description of the problem was clear enough. Thanks a lot!
>
> Kind regards
>
> __________________
> initions AG
> Chi-Hao Huynh
> Weidestraße 120a
> D-22081 Hamburg
>
> t: +49 (0) 40 / 41 49 60-62
> f: +49 (0) 40 / 41 49 60-11
> e: hu...@initios.com
> w: www.initions.com
> Full company name: initions innovative IT solutions AG
> Registered office: Hamburg
> Commercial register: Hamburg B 83929
> Chairman of the supervisory board: Dr. Michael Leue
> Executive board: Dr. Stefan Anschütz, André Paul Henkel, Dr. Helge Plehn
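
One way to confirm from outside the Admin UI whether the collection really holds no documents is to ask Solr for a match-all count. The following is only a minimal Python sketch and not part of the original thread. It assumes the collection is named reviews and is reachable at the base URL that appears in the log output (http://SVR-HDP01:8983/solr), and that the standard select handler with wt=json is available. The helper names count_url, num_found and fetch_count are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def count_url(base_url, collection):
    """Build a select URL that matches all documents but returns no rows."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/select?{params}"


def num_found(response_text):
    """Extract the total hit count from a Solr JSON select response."""
    return json.loads(response_text)["response"]["numFound"]


def fetch_count(base_url, collection, timeout=10):
    """Query Solr over HTTP and return the number of matching documents."""
    url = count_url(base_url, collection)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return num_found(resp.read().decode("utf-8"))


# Example call (assumed host and collection name, taken from the post above):
#   fetch_count("http://SVR-HDP01:8983/solr", "reviews")
```

If this returns 0 as well, the empty result is not just a display issue in the web UI, and the next place to look would be the output of the indexing job rather than the query side.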