Apache Solr Reference Guide 5.0

2015-03-06 Thread Patrick Durusau

Greetings,

I was looking at the PDF version of the Apache Solr Reference Guide 5.0 
and noticed that it has neither a TOC nor any section numbering. 
http://apache.claz.org/lucene/solr/ref-guide/apache-solr-ref-guide-5.0.pdf


The lack of a TOC and section headings makes navigation difficult.

I have just started making suggestions on the documentation and was 
wondering if there is a reason why the TOC and section headings are 
missing? (that isn't apparent from the document)


Thanks!

Hope everyone is near a great weekend!

Patrick


Re: Apache Solr Reference Guide 5.0

2015-03-06 Thread Patrick Durusau

Shawn,

Thanks!

I was using Document Viewer rather than Adobe Acrobat, so that was not 
apparent to me.

The TOC I meant is the kind found in a traditional print publication, with 
section numbers, etc., not a navigation TOC without numbering as in Adobe.


The Confluence documentation on customising PDF exports (I don't think I 
can see the actual stylesheet in use) is here:

https://confluence.atlassian.com/display/DOC/Customising+Exports+to+PDF

It says:

Disabling the Table of Contents

To prevent the table of contents from being generated in your PDF 
document, add the div.toc-macro rule to the PDF Stylesheet and set its 
display property to none:

That is why I was asking whether there was a reason the TOC and section 
numbering do not appear.

The TOC can be suppressed, but that does not appear to be the default setting.

This came up because a section said it would cover topics N - S and I 
could not determine if all those topics fell in that section or not.


Thanks!

Hope you are having a great day!

Patrick

On 03/06/2015 12:28 PM, Shawn Heisey wrote:

On 3/6/2015 10:20 AM, Patrick Durusau wrote:

I was looking at the PDF version of the Apache Solr Reference Guide
5.0 and noticed that it has neither a TOC nor any section numbering.
http://apache.claz.org/lucene/solr/ref-guide/apache-solr-ref-guide-5.0.pdf

The lack of a TOC and section headings makes navigation difficult.

I have just started making suggestions on the documentation and was
wondering if there is a reason why the TOC and section headings are
missing? (that isn't apparent from the document)

The TOC is built into the PDF and it's up to the PDF viewer to display it.

Here's a screenshot of the ref guide in Adobe Reader with a clickable
TOC open.

https://www.dropbox.com/s/3ajuri1emj61imu/refguide-5.0-TOC.png?dl=0

Section numbering might be a good idea, if it's not too intrusive or
difficult.

Thanks,
Shawn






Solr-3.5.0/Nutch-1.4 - SolrDeleteDuplicates fails

2011-12-12 Thread Patrick Durusau

Greetings!

On the Nutch Tutorial:

I can run the following commands with Solr-3.5.0/Nutch-1.4:

bin/nutch crawl urls -dir crawl -depth 3 -topN 5


then:

bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*



successfully.

But, if I run:

bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5

It fails with the following messages:

SolrIndexer: starting at 2011-12-11 14:01:27
Adding 11 documents
SolrIndexer: finished at 2011-12-11 14:01:28, elapsed: 00:00:01
SolrDeleteDuplicates: starting at 2011-12-11 14:01:28
SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:153)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

I am running on Ubuntu 10.10 with 12 GB of memory, Java version 1.6.0_26.

I can delete the crawl directory and replicate this error consistently.

Suggestions?

Other than ...use the way that doesn't fail. ;-)

I am concerned that a different invocation of Solr failing consistently 
represents something that may cause trouble elsewhere when least 
expected. (And hard to isolate as the problem.)
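
One way to narrow this down: the console output above shows SolrIndexer finishing before SolrDeleteDuplicates starts, and the stack trace runs from Crawl.run into SolrDeleteDuplicates.dedup, so the one-step crawl appears to run a deduplication stage that the two-command sequence never invoked. The Python sketch below builds the indexing and deduplication stages as separate commands so each can be run and checked on its own. It assumes that this Nutch release offers a `solrdedup` subcommand in bin/nutch and that the crawl output lives under crawl/. The helper names nutch_stage_commands and run_stages are hypothetical.

```python
import glob
import subprocess


def nutch_stage_commands(solr_url, crawl_dir="crawl", nutch="bin/nutch"):
    """Return the indexing and dedup stages as separate (name, argv) pairs.

    Assumption: `solrdedup` is the standalone entry point for the
    SolrDeleteDuplicates job in this Nutch release.
    """
    # The shell expands crawl/segments/*; here the expansion is done explicitly.
    segments = sorted(glob.glob(f"{crawl_dir}/segments/*"))
    index_cmd = [nutch, "solrindex", solr_url, f"{crawl_dir}/crawldb",
                 "-linkdb", f"{crawl_dir}/linkdb", *segments]
    dedup_cmd = [nutch, "solrdedup", solr_url]
    return [("solrindex", index_cmd), ("solrdedup", dedup_cmd)]


def run_stages(stages):
    """Run each stage in order; return the name of the first failing stage, or None."""
    for name, cmd in stages:
        if subprocess.run(cmd).returncode != 0:
            return name
    return None
```

Calling run_stages(nutch_stage_commands("http://localhost:8983/solr/")) from the Nutch directory would report which stage, if any, exits with a non-zero status. If the dedup stage fails on its own, the problem is likely independent of how the crawl was started.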


Thanks!

Hope everyone is having a great weekend!

Patrick

PS: From the hadoop log (when it fails) if that's helpful:

2011-12-11 15:21:51,436 INFO  solr.SolrWriter - Adding 11 documents
2011-12-11 15:21:52,250 INFO  solr.SolrIndexer - SolrIndexer: finished at 2011-12-11 15:21:52, elapsed: 00:00:01
2011-12-11 15:21:52,251 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-12-11 15:21:52
2011-12-11 15:21:52,251 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
2011-12-11 15:21:52,330 WARN  mapred.LocalJobRunner - job_local_0020
java.lang.NullPointerException
    at org.apache.hadoop.io.Text.encode(Text.java:388)
    at org.apache.hadoop.io.Text.set(Text.java:178)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
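
The NullPointerException is raised in Text.set, called from the record reader inside SolrDeleteDuplicates. That suggests, though the log alone does not prove it, that a value read from a Solr document was null. One way to test that guess is to ask Solr how many indexed documents lack a given field. The Python sketch below does so under several assumptions: that the field of interest is named `digest` (the log does not name it), that the standard /select handler is reachable, and that the JSON response writer is enabled. The helper names missing_field_query_url and count_missing are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def missing_field_query_url(solr_url, field):
    """Build a /select URL that matches documents with no value for `field`."""
    params = {
        # All documents, minus those that have any value in the field.
        "q": f"*:* -{field}:[* TO *]",
        "rows": "0",   # only the count is needed, not the documents
        "wt": "json",  # assumption: the JSON response writer is available
    }
    return solr_url.rstrip("/") + "/select?" + urllib.parse.urlencode(params)


def count_missing(solr_url, field):
    """Query Solr and return how many documents lack `field`."""
    with urllib.request.urlopen(missing_field_query_url(solr_url, field)) as resp:
        return json.load(resp)["response"]["numFound"]
```

For example, count_missing("http://localhost:8983/solr/", "digest") returning a non-zero number would be consistent with the null value seen in the trace. A result of zero would point the investigation elsewhere.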


--
Patrick Durusau
patr...@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
OASIS Technical Advisory Board (TAB) - member

Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau



Solr-3.5.0/Nutch-1.4 - SolrDeleteDuplicates fails

2011-12-11 Thread Patrick Durusau

Greetings!

This may be a Nutch question and if so, I will repost to the Nutch list.

I can run the following commands with Solr-3.5.0/Nutch-1.4:

bin/nutch crawl urls -dir crawl -depth 3 -topN 5


then:

bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*


successfully.

But, if I run:

bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5

It fails with the following messages:

SolrIndexer: starting at 2011-12-11 14:01:27
Adding 11 documents
SolrIndexer: finished at 2011-12-11 14:01:28, elapsed: 00:00:01
SolrDeleteDuplicates: starting at 2011-12-11 14:01:28
SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:153)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

I am running on Ubuntu 10.10 with 12 GB of memory, Java version 1.6.0_26.

I can delete the crawl directory and replicate this error consistently.

Suggestions?

Other than ...use the way that doesn't fail. ;-)

I am concerned that a different invocation of Solr failing consistently 
represents something that may cause trouble elsewhere when least 
expected. (And hard to isolate as the problem.)


Thanks!

Hope everyone is having a great weekend!

Patrick

PS: From the hadoop log (when it fails) if that's helpful:

2011-12-11 15:21:51,436 INFO  solr.SolrWriter - Adding 11 documents
2011-12-11 15:21:52,250 INFO  solr.SolrIndexer - SolrIndexer: finished at 2011-12-11 15:21:52, elapsed: 00:00:01
2011-12-11 15:21:52,251 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-12-11 15:21:52
2011-12-11 15:21:52,251 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
2011-12-11 15:21:52,330 WARN  mapred.LocalJobRunner - job_local_0020
java.lang.NullPointerException
    at org.apache.hadoop.io.Text.encode(Text.java:388)
    at org.apache.hadoop.io.Text.set(Text.java:178)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)

