Amrit Sarkar wrote:

>> Kevin,
>>
>> I am not able to replicate the issue on my system, which is a bit annoying
>> for me. Try this out one last time:
>>
>> docker exec -it --user=solr solr bin/post -c handbook
>> http://quadra.franz.com:9091/index.md -recursive 10 -delay 0 -filetypes html
>>
>> and have Content-Type: "html" and "text/html", try with both.
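As a rough illustration of why the served Content-Type matters for the suggestion above: the sketch below assumes the crawler accepts a page only when its Content-Type equals the MIME type mapped to one of the endings passed to -filetypes. ContentTypeCheck and typeAccepted are hypothetical names, the map is a subset of the mimeMap entries quoted further down in this thread, and stripping parameters such as a charset is an assumption of the sketch, not confirmed SimplePostTool behavior.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class ContentTypeCheck {
    // Subset of the ending-to-MIME-type entries quoted later in this thread.
    static final Map<String, String> MIME_MAP = new HashMap<>();
    static {
        MIME_MAP.put("xml", "application/xml");
        MIME_MAP.put("csv", "text/csv");
        MIME_MAP.put("json", "application/json");
        MIME_MAP.put("pdf", "application/pdf");
        MIME_MAP.put("html", "text/html");
        MIME_MAP.put("htm", "text/html");
    }

    // Hypothetical helper: true if the served Content-Type corresponds
    // to one of the requested file endings.
    static boolean typeAccepted(String contentType, Set<String> fileTypes) {
        if (contentType == null) {
            return false;
        }
        // Assumption: ignore parameters such as "; charset=UTF-8" and case.
        String base = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        for (String ending : fileTypes) {
            // Endings without a map entry (for example "md") never match.
            if (base.equals(MIME_MAP.get(ending))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> fileTypes = new HashSet<>(Arrays.asList("html"));
        System.out.println(typeAccepted("text/html", fileTypes));     // true
        System.out.println(typeAccepted("html", fileTypes));          // false
        System.out.println(typeAccepted("text/markdown", fileTypes)); // false
    }
}
```

Under these assumptions only a full MIME type such as "text/html" matches "-filetypes html"; a bare "html" header value does not, and an ending like "md" that has no map entry matches nothing.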
With text/html and your command I get:

quadra[git:master]$ docker exec -it --user=solr solr bin/post -c handbook http://quadra.franz.com:9091/index.md -recursive 10 -delay 0 -filetypes html
/docker-java-home/jre/bin/java -classpath /opt/solr/dist/solr-core-7.0.1.jar -Dauto=yes -Drecursive=10 -Ddelay=0 -Dfiletypes=html -Dc=handbook -Ddata=web org.apache.solr.util.SimplePostTool http://quadra.franz.com:9091/index.md
SimplePostTool version 5.0.0
Posting web pages to Solr url http://localhost:8983/solr/handbook/update/extract
Entering auto mode. Indexing pages with content-types corresponding to file endings html
SimplePostTool: WARNING: Never crawl an external web site faster than every 10 seconds, your IP will probably be blocked
Entering recursive mode, depth=10, delay=0s
Entering crawl at level 0 (1 links total, 1 new)
POSTed web resource http://quadra.franz.com:9091/index.md (depth: 0)
[Fatal Error] :1:1: Content is not allowed in prolog.
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at org.apache.solr.util.SimplePostTool$PageFetcher.getLinksFromWebPage(SimplePostTool.java:1252)
    at org.apache.solr.util.SimplePostTool.webCrawl(SimplePostTool.java:616)
    at org.apache.solr.util.SimplePostTool.postWebPages(SimplePostTool.java:563)
    at org.apache.solr.util.SimplePostTool.doWebMode(SimplePostTool.java:365)
    at org.apache.solr.util.SimplePostTool.execute(SimplePostTool.java:187)
    at org.apache.solr.util.SimplePostTool.main(SimplePostTool.java:172)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
    at org.apache.solr.util.SimplePostTool.makeDom(SimplePostTool.java:1061)
    at org.apache.solr.util.SimplePostTool$PageFetcher.getLinksFromWebPage(SimplePostTool.java:1232)
    ... 5 more

When I use "-filetypes md", I am back to the regular output that doesn't scan anything.

>>
>> If you get past this hurdle, let me know.
>>
>> Amrit Sarkar
>> Search Engineer
>> Lucidworks, Inc.
>> 415-589-9269
>> www.lucidworks.com
>> Twitter http://twitter.com/lucidworks
>> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>>
>> On Fri, Oct 13, 2017 at 8:22 PM, Kevin Layer <la...@franz.com> wrote:
>>
>> > Amrit Sarkar wrote:
>> >
>> > >> ah oh, dockers. They are placed under [solr-home]/server/log/solr/log in
>> > >> the machine. I haven't played much with docker, any way you can get that
>> > >> file from that location.
>> >
>> > I see these files:
>> >
>> > /opt/solr/server/logs/archived
>> > /opt/solr/server/logs/solr_gc.log.0.current
>> > /opt/solr/server/logs/solr.log
>> > /opt/solr/server/solr/handbook/data/tlog
>> >
>> > The 3rd one has very little info.
>> > Attached:
>> >
>> >
>> > 2017-10-11 15:28:09.564 INFO (main) [ ] o.e.j.s.Server jetty-9.3.14.v20161028
>> > 2017-10-11 15:28:10.668 INFO (main) [ ] o.a.s.s.SolrDispatchFilter ___ _ Welcome to Apache Solr™ version 7.0.1
>> > 2017-10-11 15:28:10.669 INFO (main) [ ] o.a.s.s.SolrDispatchFilter / __| ___| |_ _ Starting in standalone mode on port 8983
>> > 2017-10-11 15:28:10.670 INFO (main) [ ] o.a.s.s.SolrDispatchFilter \__ \/ _ \ | '_| Install dir: /opt/solr, Default config dir: /opt/solr/server/solr/configsets/_default/conf
>> > 2017-10-11 15:28:10.707 INFO (main) [ ] o.a.s.s.SolrDispatchFilter |___/\___/_|_| Start time: 2017-10-11T15:28:10.674Z
>> > 2017-10-11 15:28:10.747 INFO (main) [ ] o.a.s.c.SolrResourceLoader Using system property solr.solr.home: /opt/solr/server/solr
>> > 2017-10-11 15:28:10.763 INFO (main) [ ] o.a.s.c.SolrXmlConfig Loading container configuration from /opt/solr/server/solr/solr.xml
>> > 2017-10-11 15:28:11.062 INFO (main) [ ] o.a.s.c.SolrResourceLoader [null] Added 0 libs to classloader, from paths: []
>> > 2017-10-11 15:28:12.514 INFO (main) [ ] o.a.s.c.CorePropertiesLocator Found 0 core definitions underneath /opt/solr/server/solr
>> > 2017-10-11 15:28:12.635 INFO (main) [ ] o.e.j.s.Server Started @4304ms
>> > 2017-10-11 15:29:00.971 INFO (qtp1911006827-13) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={wt=json} status=0 QTime=108
>> > 2017-10-11 15:29:01.080 INFO (qtp1911006827-18) [ ] o.a.s.c.TransientSolrCoreCacheDefault Allocating transient cache for 2147483647 transient cores
>> > 2017-10-11 15:29:01.083 INFO (qtp1911006827-18) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores params={core=handbook&action=STATUS&wt=json} status=0 QTime=5
>> > 2017-10-11 15:29:01.194 INFO (qtp1911006827-19) [ ] o.a.s.h.a.CoreAdminOperation core create command name=handbook&action=CREATE&instanceDir=handbook&wt=json
2017-10-11 15:29:01.342 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.c.SolrResourceLoader [handbook] Added 51 libs to classloader, from >> > paths: [/opt/solr/contrib/clustering/lib, /opt/solr/contrib/extraction/lib, >> > /opt/solr/contrib/langid/lib, /opt/solr/contrib/velocity/lib, >> > /opt/solr/dist] >> > 2017-10-11 15:29:01.504 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.c.SolrConfig Using Lucene MatchVersion: 7.0.1 >> > 2017-10-11 15:29:01.969 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.s.IndexSchema [handbook] Schema name=default-config >> > 2017-10-11 15:29:03.678 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.s.IndexSchema Loaded schema default-config/1.6 with uniqueid field id >> > 2017-10-11 15:29:03.806 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.c.CoreContainer Creating SolrCore 'handbook' using configuration from >> > instancedir /opt/solr/server/solr/handbook, trusted=true >> > 2017-10-11 15:29:03.853 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.c.SolrCore solr.RecoveryStrategy.Builder >> > 2017-10-11 15:29:03.866 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.c.SolrCore [[handbook] ] Opening new SolrCore at >> > [/opt/solr/server/solr/handbook], dataDir=[/opt/solr/server/ >> > solr/handbook/data/] >> > 2017-10-11 15:29:04.180 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.r.XSLTResponseWriter xsltCacheLifetimeSeconds=5 >> > 2017-10-11 15:29:05.100 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.UpdateHandler Using UpdateLog implementation: >> > org.apache.solr.update.UpdateLog >> > 2017-10-11 15:29:05.101 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.UpdateLog Initializing UpdateLog: dataDir= defaultSyncLevel=FLUSH >> > numRecordsToKeep=100 maxNumLogsToKeep=10 numVersionBuckets=65536 >> > 2017-10-11 15:29:05.150 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.CommitTracker Hard AutoCommit: if uncommited for 15000ms; >> > 2017-10-11 15:29:05.151 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.CommitTracker 
Soft AutoCommit: disabled >> > 2017-10-11 15:29:05.199 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.s.SolrIndexSearcher Opening [Searcher@2b9fd97b[handbook] main] >> > 2017-10-11 15:29:05.229 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.r.ManagedResourceStorage File-based storage initialized to use dir: >> > /opt/solr/server/solr/handbook/conf >> > 2017-10-11 15:29:05.266 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.h.c.SpellCheckComponent Initializing spell checkers >> > 2017-10-11 15:29:05.283 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.s.DirectSolrSpellChecker init: {name=default,field=_text_, >> > classname=solr.DirectSolrSpellChecker,distanceMeasure=internal, >> > accuracy=0.5,maxEdits=2,minPrefix=1,maxInspections=5,minQueryLength=4, >> > maxQueryFrequency=0.01} >> > 2017-10-11 15:29:05.318 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.h.ReplicationHandler Commits will be reserved for 10000 >> > 2017-10-11 15:29:05.434 INFO >> > (searcherExecutor-7-thread-1-processing-x:handbook) >> > [ x:handbook] o.a.s.c.QuerySenderListener QuerySenderListener sending >> > requests to Searcher@2b9fd97b[handbook] main{ExitableDirectoryReader( >> > UninvertingDirectoryReader())} >> > 2017-10-11 15:29:05.439 INFO >> > (searcherExecutor-7-thread-1-processing-x:handbook) >> > [ x:handbook] o.a.s.c.QuerySenderListener QuerySenderListener done. 
>> > 2017-10-11 15:29:05.440 INFO >> > (searcherExecutor-7-thread-1-processing-x:handbook) >> > [ x:handbook] o.a.s.h.c.SpellCheckComponent Loading spell index for >> > spellchecker: default >> > 2017-10-11 15:29:05.447 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.UpdateLog Could not find max version in index or recent updates, >> > using new clock 1580975517016784896 >> > 2017-10-11 15:29:05.468 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={name=handbook&action=CREATE&instanceDir=handbook&wt=json} >> > status=0 QTime=4275 >> > 2017-10-11 15:29:05.494 INFO >> > (searcherExecutor-7-thread-1-processing-x:handbook) >> > [ x:handbook] o.a.s.c.SolrCore [handbook] Registered new searcher >> > Searcher@2b9fd97b[handbook] main{ExitableDirectoryReader( >> > UninvertingDirectoryReader())} >> > 2017-10-11 15:36:24.537 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507736184190} status=0 QTime=1 >> > 2017-10-11 15:36:24.579 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507736184191} status=0 QTime=38 >> > 2017-10-11 15:36:27.810 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507736184190} status=0 QTime=0 >> > 2017-10-11 15:36:27.846 INFO (qtp1911006827-13) [ x:handbook] >> > o.a.s.c.S.Request [handbook] webapp=/solr path=/admin/ping >> > params={action=status&wt=json&_=1507736184191&ts=1507736184191} >> > status=503 QTime=8 >> > 2017-10-11 15:36:27.852 INFO (qtp1911006827-14) [ x:handbook] >> > o.a.s.c.S.Request [handbook] webapp=/solr path=/admin/luke >> > params={numTerms=0&show=index&wt=json&_=1507736187772} status=0 QTime=35 >> > 2017-10-11 15:36:27.866 INFO (qtp1911006827-18) [ x:handbook] >> > o.a.s.c.S.Request [handbook] webapp=/solr path=/replication >> 
> params={wt=json&command=details&_=1507736187773} status=0 QTime=53 >> > 2017-10-11 15:36:27.893 INFO (qtp1911006827-16) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507736184191} status=0 QTime=84 >> > 2017-10-11 15:36:27.894 INFO (qtp1911006827-11) [ x:handbook] >> > o.a.s.c.S.Request [handbook] webapp=/solr path=/admin/system >> > params={wt=json&_=1507736187773} status=0 QTime=64 >> > 2017-10-11 15:36:33.015 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507736184190} status=0 QTime=0 >> > 2017-10-11 15:36:33.033 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507736184191} status=0 QTime=18 >> > 2017-10-11 15:36:35.199 INFO (qtp1911006827-14) [ x:handbook] >> > o.a.s.c.S.Request [handbook] webapp=/solr path=/select >> > params={q=*:*&_=1507736184481} hits=0 status=0 QTime=54 >> > 2017-10-13 13:10:43.480 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.c.PluginBag Going to create a new requestHandler with {type = >> > requestHandler,name = /update/extract,class = solr.extraction. >> > ExtractingRequestHandler,attributes = {startup=lazy, >> > name=/update/extract, class=solr.extraction.ExtractingRequestHandler},args >> > = {defaults={lowernames=true,fmap.meta=ignored_,fmap. >> > content=_text_,df=_text_}}} >> > 2017-10-13 13:10:46.287 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 start commit{_version_= >> > 1581148008618131456,optimize=false,openSearcher=true,waitSearcher=true, >> > expungeDeletes=false,softCommit=false,prepareCommit=false} >> > 2017-10-13 13:10:46.288 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit. 
>> > 2017-10-13 13:10:46.374 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 end_commit_flush >> > 2017-10-13 13:10:46.375 INFO (qtp1911006827-19) [ x:handbook] >> > o.a.s.u.p.LogUpdateProcessorFactory >> > [handbook] webapp=/solr path=/update/extract params={commit=true}{commit=} >> > 0 2947 >> > 2017-10-13 13:20:09.424 INFO (qtp1911006827-11) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 start commit{_version_= >> > 1581148599141531648,optimize=false,openSearcher=true,waitSearcher=true, >> > expungeDeletes=false,softCommit=false,prepareCommit=false} >> > 2017-10-13 13:20:09.447 INFO (qtp1911006827-11) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit. >> > 2017-10-13 13:20:09.450 INFO (qtp1911006827-11) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 end_commit_flush >> > 2017-10-13 13:20:09.451 INFO (qtp1911006827-11) [ x:handbook] >> > o.a.s.u.p.LogUpdateProcessorFactory >> > [handbook] webapp=/solr path=/update/extract params={commit=true}{commit=} >> > 0 27 >> > 2017-10-13 13:21:29.872 INFO (qtp1911006827-17) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 start commit{_version_= >> > 1581148683498422272,optimize=false,openSearcher=true,waitSearcher=true, >> > expungeDeletes=false,softCommit=false,prepareCommit=false} >> > 2017-10-13 13:21:29.873 INFO (qtp1911006827-17) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit. 
>> > 2017-10-13 13:21:29.874 INFO (qtp1911006827-17) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 end_commit_flush >> > 2017-10-13 13:21:29.876 INFO (qtp1911006827-17) [ x:handbook] >> > o.a.s.u.p.LogUpdateProcessorFactory >> > [handbook] webapp=/solr path=/update/extract params={commit=true}{commit=} >> > 0 4 >> > 2017-10-13 14:12:16.157 INFO (qtp1911006827-15) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 start commit{_version_= >> > 1581151877759762432,optimize=false,openSearcher=true,waitSearcher=true, >> > expungeDeletes=false,softCommit=false,prepareCommit=false} >> > 2017-10-13 14:12:16.158 INFO (qtp1911006827-15) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit. >> > 2017-10-13 14:12:16.161 INFO (qtp1911006827-15) [ x:handbook] >> > o.a.s.u.DirectUpdateHandler2 end_commit_flush >> > 2017-10-13 14:12:16.162 INFO (qtp1911006827-15) [ x:handbook] >> > o.a.s.u.p.LogUpdateProcessorFactory >> > [handbook] webapp=/solr path=/update/extract params={commit=true}{commit=} >> > 0 6 >> > 2017-10-13 14:34:13.809 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507905253481} status=0 QTime=42 >> > 2017-10-13 14:34:14.006 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507905253483} status=0 QTime=239 >> > 2017-10-13 14:34:14.063 INFO (qtp1911006827-18) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507905253483} status=0 QTime=28 >> > 2017-10-13 14:34:17.720 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507905253481} status=0 QTime=0 >> > 2017-10-13 14:34:17.767 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=43 >> > 
2017-10-13 14:34:17.773 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507905253483} status=0 QTime=54 >> > 2017-10-13 14:34:27.726 INFO (qtp1911006827-11) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:34:37.719 INFO (qtp1911006827-19) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:34:41.174 INFO (qtp1911006827-18) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507905253481} status=0 QTime=0 >> > 2017-10-13 14:34:41.222 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507905253483} status=0 QTime=48 >> > 2017-10-13 14:34:41.287 INFO (qtp1911006827-18) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507905253483} status=0 QTime=17 >> > 2017-10-13 14:34:42.737 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507905253481} status=0 QTime=0 >> > 2017-10-13 14:34:42.745 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:34:42.763 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507905253483} status=0 QTime=25 >> > 2017-10-13 14:34:52.980 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:35:02.976 INFO (qtp1911006827-11) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null 
path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:35:12.976 INFO (qtp1911006827-19) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:35:22.977 INFO (qtp1911006827-16) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:35:32.981 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:35:42.986 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:35:52.986 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:36:02.988 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:36:12.994 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:36:22.994 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:36:33.002 INFO (qtp1911006827-11) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:36:43.010 INFO (qtp1911006827-16) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 
>> > 2017-10-13 14:36:52.995 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:37:02.997 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:37:13.002 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:37:23.014 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:37:24.960 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507905253481} status=0 QTime=0 >> > 2017-10-13 14:37:25.004 INFO (qtp1911006827-11) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507905253483} status=0 QTime=19 >> > 2017-10-13 14:37:25.112 INFO (qtp1911006827-18) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696} status=0 QTime=76 >> > 2017-10-13 14:38:07.403 INFO (qtp1911006827-19) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores >> > params={indexInfo=false&wt=json&_=1507905253481} status=0 QTime=0 >> > 2017-10-13 14:38:07.440 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:38:07.451 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system >> > params={wt=json&_=1507905253483} status=0 QTime=18 >> > 2017-10-13 14:38:17.391 INFO (qtp1911006827-18) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null 
path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:38:27.393 INFO (qtp1911006827-16) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:38:37.403 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:38:47.395 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:38:57.399 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:39:07.400 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:39:17.404 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:39:27.406 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:39:37.408 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:39:47.415 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:39:57.416 INFO (qtp1911006827-16) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 
>> > 2017-10-13 14:40:07.431 INFO (qtp1911006827-19) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:40:17.421 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:40:27.421 INFO (qtp1911006827-11) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:40:37.422 INFO (qtp1911006827-11) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:40:47.422 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:40:57.428 INFO (qtp1911006827-18) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:41:07.431 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:41:17.422 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:41:27.423 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:41:37.423 INFO (qtp1911006827-19) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:41:47.426 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall 
[admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:41:57.441 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:42:07.434 INFO (qtp1911006827-11) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:42:17.434 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:42:27.435 INFO (qtp1911006827-18) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:42:37.439 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:42:47.697 INFO (qtp1911006827-16) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:42:57.804 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:43:08.323 INFO (qtp1911006827-19) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:43:18.653 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:43:28.813 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > 
params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:43:38.816 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:43:48.815 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:43:58.817 INFO (qtp1911006827-20) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:44:08.813 INFO (qtp1911006827-13) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:44:18.820 INFO (qtp1911006827-16) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:44:28.818 INFO (qtp1911006827-17) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:44:38.821 INFO (qtp1911006827-19) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:44:48.823 INFO (qtp1911006827-15) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:44:58.819 INFO (qtp1911006827-11) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:45:08.824 INFO (qtp1911006827-14) [ ] >> > o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging >> > params={wt=json&_=1507905257696&since=0} status=0 QTime=0 >> > 2017-10-13 14:45:18.820 
INFO (qtp1911006827-18) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:45:28.824 INFO (qtp1911006827-20) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:45:38.823 INFO (qtp1911006827-13) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:45:48.824 INFO (qtp1911006827-16) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:45:58.819 INFO (qtp1911006827-17) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:46:08.822 INFO (qtp1911006827-17) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:46:18.820 INFO (qtp1911006827-17) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:46:28.820 INFO (qtp1911006827-11) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:46:38.826 INFO (qtp1911006827-14) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:46:48.823 INFO (qtp1911006827-18) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:46:58.825 INFO (qtp1911006827-20) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:47:08.827 INFO (qtp1911006827-13) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:47:18.846 INFO (qtp1911006827-13) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:47:28.825 INFO (qtp1911006827-19) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:47:38.826 INFO (qtp1911006827-15) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:47:50.183 INFO (qtp1911006827-17) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=1356
>> > 2017-10-13 14:47:58.828 INFO (qtp1911006827-11) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:48:08.828 INFO (qtp1911006827-14) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:48:18.885 INFO (qtp1911006827-18) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:48:28.827 INFO (qtp1911006827-20) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:48:38.831 INFO (qtp1911006827-16) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:48:48.833 INFO (qtp1911006827-13) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:48:58.833 INFO (qtp1911006827-13) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:49:08.834 INFO (qtp1911006827-15) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:49:18.832 INFO (qtp1911006827-17) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:49:28.835 INFO (qtp1911006827-11) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:49:38.861 INFO (qtp1911006827-14) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=14
>> > 2017-10-13 14:49:48.853 INFO (qtp1911006827-18) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:49:58.837 INFO (qtp1911006827-20) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> > 2017-10-13 14:50:08.833 INFO (qtp1911006827-16) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/logging params={wt=json&_=1507905257696&since=0} status=0 QTime=0
>> >
>> > >>
>> > >> Amrit Sarkar
>> > >> Search Engineer
>> > >> Lucidworks, Inc.
>> > >> 415-589-9269
>> > >> www.lucidworks.com
>> > >> Twitter http://twitter.com/lucidworks
>> > >> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>> > >>
>> > >> On Fri, Oct 13, 2017 at 8:08 PM, Kevin Layer <la...@franz.com> wrote:
>> > >>
>> > >> > Amrit Sarkar wrote:
>> > >> >
>> > >> > >> Hi Kevin,
>> > >> > >>
>> > >> > >> Can you post the solr log in the mail thread. I don't think it handled the
>> > >> > >> .md by itself by first glance at code.
>> > >> >
>> > >> > How do I extract the log you want?
>> > >> >
>> > >> > >>
>> > >> > >> Amrit Sarkar
>> > >> > >> Search Engineer
>> > >> > >> Lucidworks, Inc.
>> > >> > >> 415-589-9269
>> > >> > >> www.lucidworks.com
>> > >> > >> Twitter http://twitter.com/lucidworks
>> > >> > >> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>> > >> > >>
>> > >> > >> On Fri, Oct 13, 2017 at 7:42 PM, Kevin Layer <la...@franz.com> wrote:
>> > >> > >>
>> > >> > >> > Amrit Sarkar wrote:
>> > >> > >> >
>> > >> > >> > >> Kevin,
>> > >> > >> > >>
>> > >> > >> > >> Just put "html" too and give it a shot. These are the types it is expecting:
>> > >> > >> >
>> > >> > >> > Same thing.
>> > >> > >> >
>> > >> > >> > >>
>> > >> > >> > >> mimeMap = new HashMap<>();
>> > >> > >> > >> mimeMap.put("xml", "application/xml");
>> > >> > >> > >> mimeMap.put("csv", "text/csv");
>> > >> > >> > >> mimeMap.put("json", "application/json");
>> > >> > >> > >> mimeMap.put("jsonl", "application/json");
>> > >> > >> > >> mimeMap.put("pdf", "application/pdf");
>> > >> > >> > >> mimeMap.put("rtf", "text/rtf");
>> > >> > >> > >> mimeMap.put("html", "text/html");
>> > >> > >> > >> mimeMap.put("htm", "text/html");
>> > >> > >> > >> mimeMap.put("doc", "application/msword");
>> > >> > >> > >> mimeMap.put("docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
>> > >> > >> > >> mimeMap.put("ppt", "application/vnd.ms-powerpoint");
>> > >> > >> > >> mimeMap.put("pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation");
>> > >> > >> > >> mimeMap.put("xls", "application/vnd.ms-excel");
>> > >> > >> > >> mimeMap.put("xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
>> > >> > >> > >> mimeMap.put("odt", "application/vnd.oasis.opendocument.text");
>> > >> > >> > >> mimeMap.put("ott", "application/vnd.oasis.opendocument.text");
>> > >> > >> > >> mimeMap.put("odp", "application/vnd.oasis.opendocument.presentation");
>> > >> > >> > >> mimeMap.put("otp", "application/vnd.oasis.opendocument.presentation");
>> > >> > >> > >> mimeMap.put("ods", "application/vnd.oasis.opendocument.spreadsheet");
>> > >> > >> > >> mimeMap.put("ots", "application/vnd.oasis.opendocument.spreadsheet");
>> > >> > >> > >> mimeMap.put("txt", "text/plain");
>> > >> > >> > >> mimeMap.put("log", "text/plain");
>> > >> > >> > >>
>> > >> > >> > >> The keys are the types supported.
>> > >> > >> > >>
>> > >> > >> > >> Amrit Sarkar
>> > >> > >> > >> Search Engineer
>> > >> > >> > >> Lucidworks, Inc.
>> > >> > >> > >> 415-589-9269
>> > >> > >> > >> www.lucidworks.com
>> > >> > >> > >> Twitter http://twitter.com/lucidworks
>> > >> > >> > >> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>> > >> > >> > >>
>> > >> > >> > >> On Fri, Oct 13, 2017 at 6:56 PM, Amrit Sarkar <sarkaramr...@gmail.com> wrote:
>> > >> > >> > >>
>> > >> > >> > >> > Ah!
>> > >> > >> > >> >
>> > >> > >> > >> > Only supported type is: text/html; encoding=utf-8
>> > >> > >> > >> >
>> > >> > >> > >> > I am not confident of this either :) but this should work.
>> > >> > >> > >> >
>> > >> > >> > >> > See the code-snippet below:
>> > >> > >> > >> >
>> > >> > >> > >> > ......
>> > >> > >> > >> >
>> > >> > >> > >> > if(res.httpStatus == 200) {
>> > >> > >> > >> >   // Raw content type of form "text/html; encoding=utf-8"
>> > >> > >> > >> >   String rawContentType = conn.getContentType();
>> > >> > >> > >> >   String type = rawContentType.split(";")[0];
>> > >> > >> > >> >   if(typeSupported(type) || "*".equals(fileTypes)) {
>> > >> > >> > >> >     String encoding = conn.getContentEncoding();
>> > >> > >> > >> >
>> > >> > >> > >> > ....
>> > >> > >> > >> >
>> > >> > >> > >> > Amrit Sarkar
>> > >> > >> > >> > Search Engineer
>> > >> > >> > >> > Lucidworks, Inc.
>> > >> > >> > >> > 415-589-9269
>> > >> > >> > >> > www.lucidworks.com
>> > >> > >> > >> > Twitter http://twitter.com/lucidworks
>> > >> > >> > >> > LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>> > >> > >> > >> >
>> > >> > >> > >> > On Fri, Oct 13, 2017 at 6:51 PM, Kevin Layer <la...@franz.com> wrote:
>> > >> > >> > >> >
>> > >> > >> > >> >> Amrit Sarkar wrote:
>> > >> > >> > >> >>
>> > >> > >> > >> >> >> Strange,
>> > >> > >> > >> >> >>
>> > >> > >> > >> >> >> Can you add: "text/html;charset=utf-8". This is wiki.apache.org page's
>> > >> > >> > >> >> >> Content-Type. Let's see what it says now.
>> > >> > >> > >> >>
>> > >> > >> > >> >> Same thing.
Verified Content-Type:
>> > >> > >> > >> >>
>> > >> > >> > >> >> quadra[git:master]$ wget -S -O /dev/null http://quadra:9091/index.md |& grep Content-Type
>> > >> > >> > >> >> Content-Type: text/html;charset=utf-8
>> > >> > >> > >> >> quadra[git:master]$ ]
>> > >> > >> > >> >>
>> > >> > >> > >> >> quadra[git:master]$ docker exec -it --user=solr solr bin/post -c handbook http://quadra:9091/index.md -recursive 10 -delay 0 -filetypes md
>> > >> > >> > >> >> /docker-java-home/jre/bin/java -classpath /opt/solr/dist/solr-core-7.0.1.jar -Dauto=yes -Drecursive=10 -Ddelay=0 -Dfiletypes=md -Dc=handbook -Ddata=web org.apache.solr.util.SimplePostTool http://quadra:9091/index.md
>> > >> > >> > >> >> SimplePostTool version 5.0.0
>> > >> > >> > >> >> Posting web pages to Solr url http://localhost:8983/solr/handbook/update/extract
>> > >> > >> > >> >> Entering auto mode. Indexing pages with content-types corresponding to file endings md
>> > >> > >> > >> >> SimplePostTool: WARNING: Never crawl an external web site faster than every 10 seconds, your IP will probably be blocked
>> > >> > >> > >> >> Entering recursive mode, depth=10, delay=0s
>> > >> > >> > >> >> Entering crawl at level 0 (1 links total, 1 new)
>> > >> > >> > >> >> SimplePostTool: WARNING: Skipping URL with unsupported type text/html
>> > >> > >> > >> >> SimplePostTool: WARNING: The URL http://quadra:9091/index.md returned a HTTP result status of 415
>> > >> > >> > >> >> 0 web pages indexed.
>> > >> > >> > >> >> COMMITting Solr index changes to http://localhost:8983/solr/handbook/update/extract...
>> > >> > >> > >> >> Time spent: 0:00:00.531
>> > >> > >> > >> >> quadra[git:master]$
>> > >> > >> > >> >>
>> > >> > >> > >> >> Kevin
>> > >> > >> > >> >>
>> > >> > >> > >> >> >>
>> > >> > >> > >> >> >> Amrit Sarkar
>> > >> > >> > >> >> >> Search Engineer
>> > >> > >> > >> >> >> Lucidworks, Inc.
>> > >> > >> > >> >> >> 415-589-9269
>> > >> > >> > >> >> >> www.lucidworks.com
>> > >> > >> > >> >> >> Twitter http://twitter.com/lucidworks
>> > >> > >> > >> >> >> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>> > >> > >> > >> >> >>
>> > >> > >> > >> >> >> On Fri, Oct 13, 2017 at 6:44 PM, Kevin Layer <la...@franz.com> wrote:
>> > >> > >> > >> >> >>
>> > >> > >> > >> >> >> > OK, so I hacked markserv to add Content-Type text/html, but now I get
>> > >> > >> > >> >> >> >
>> > >> > >> > >> >> >> > SimplePostTool: WARNING: Skipping URL with unsupported type text/html
>> > >> > >> > >> >> >> >
>> > >> > >> > >> >> >> > What is it expecting?
>> > >> > >> > >> >> >> >
>> > >> > >> > >> >> >> > $ docker exec -it --user=solr solr bin/post -c handbook http://quadra:9091/index.md -recursive 10 -delay 0 -filetypes md
>> > >> > >> > >> >> >> > /docker-java-home/jre/bin/java -classpath /opt/solr/dist/solr-core-7.0.1.jar -Dauto=yes -Drecursive=10 -Ddelay=0 -Dfiletypes=md -Dc=handbook -Ddata=web org.apache.solr.util.SimplePostTool http://quadra:9091/index.md
>> > >> > >> > >> >> >> > SimplePostTool version 5.0.0
>> > >> > >> > >> >> >> > Posting web pages to Solr url http://localhost:8983/solr/handbook/update/extract
>> > >> > >> > >> >> >> > Entering auto mode. Indexing pages with content-types corresponding to file endings md
>> > >> > >> > >> >> >> > SimplePostTool: WARNING: Never crawl an external web site faster than every 10 seconds, your IP will probably be blocked
>> > >> > >> > >> >> >> > Entering recursive mode, depth=10, delay=0s
>> > >> > >> > >> >> >> > Entering crawl at level 0 (1 links total, 1 new)
>> > >> > >> > >> >> >> > SimplePostTool: WARNING: Skipping URL with unsupported type text/html
>> > >> > >> > >> >> >> > SimplePostTool: WARNING: The URL http://quadra:9091/index.md returned a HTTP result status of 415
>> > >> > >> > >> >> >> > 0 web pages indexed.
>> > >> > >> > >> >> >> > COMMITting Solr index changes to http://localhost:8983/solr/handbook/update/extract...
>> > >> > >> > >> >> >> > Time spent: 0:00:03.882
>> > >> > >> > >> >> >> > $
>> > >> > >> > >> >> >> >
>> > >> > >> > >> >> >> > Thanks.
>> > >> > >> > >> >> >> >
>> > >> > >> > >> >> >> > Kevin
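
For anyone reading the quoted exchange later: the "Skipping URL with unsupported type text/html" warning with -filetypes md, and the successful POST with -filetypes html, would both be consistent with a check along the following lines. This is only a sketch built from the two snippets quoted in the thread (the mimeMap and the typeSupported(type) call). It is not the actual SimplePostTool source. The class name ContentTypeCheck, the helper baseType, and the two-argument typeSupported are hypothetical names made up for illustration, and only a few of the quoted mimeMap entries are copied.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; NOT the real SimplePostTool implementation.
class ContentTypeCheck {
    // A few entries copied from the mimeMap quoted in the thread.
    static final Map<String, String> MIME_MAP = new HashMap<>();
    static {
        MIME_MAP.put("html", "text/html");
        MIME_MAP.put("htm", "text/html");
        MIME_MAP.put("txt", "text/plain");
        MIME_MAP.put("pdf", "application/pdf");
    }

    // Hypothetical helper: drop parameters such as ";charset=utf-8",
    // mirroring the rawContentType.split(";")[0] line in the quoted snippet.
    static String baseType(String rawContentType) {
        return rawContentType.split(";")[0].trim().toLowerCase(Locale.ROOT);
    }

    // Hypothetical check (assumption): a MIME type is accepted when one of
    // the comma-separated file endings maps to it, or when fileTypes is "*".
    static boolean typeSupported(String type, String fileTypes) {
        if ("*".equals(fileTypes)) {
            return true;
        }
        for (String ext : fileTypes.split(",")) {
            String mapped = MIME_MAP.get(ext.trim().toLowerCase(Locale.ROOT));
            if (mapped != null && mapped.equals(type)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String type = baseType("text/html;charset=utf-8");
        System.out.println(typeSupported(type, "md"));   // false: "md" has no map entry
        System.out.println(typeSupported(type, "html")); // true: "html" maps to text/html
        System.out.println(typeSupported(type, "*"));    // true: wildcard accepts any type
    }
}
```

Under these assumptions, changing the server's Content-Type alone cannot help while -filetypes is md, because md is not a key in the map; the file ending passed to -filetypes has to be one whose mapped MIME type equals the type the server returns.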