On 24 March 2012 10:44, Thufir <[email protected]> wrote:
> What's the correct way to get an article body?
>
> I'm using java.util.logging.Logger to catch
> org.apache.commons.net.MalformedServerReplyException to a log file:
>
> <record>
> <date>2012-03-24T03:09:35</date>
> <millis>1332583775299</millis>
> <sequence>1</sequence>
> <logger>gwene.LogUtils</logger>
> <level>INFO</level>
> <class>gwene.LogUtils</class>
> <method>logArticles</method>
> <thread>1</thread>
> <message>Could not parse response code.
> Server Reply: <p>Alex &#8220;Hurricane&#8221; Higgins,
> transformer of snooker, died on July 24th, aged
> ...text snipped...
> mercilessly, one by one. ...</p><div
> class="feedflare"></message>
> </record>
>
>
> The server reply is *exactly* what I'm missing, the content of the article.
> code and full output:
>
> https://gist.github.com/2180843
>
> I'm guessing that the HTML is throwing things off? What does
> NNTPClient.retrieveArticleBody expect? After all, anything can be in an
> NNTP post.

NNTP is defined in RFC 977: http://tools.ietf.org/html/rfc977

See section 3.1.3, which shows that the body content must be preceded by
a status reply. That status line appears to be missing from the server's
response.

> Now, what I'm really after, I suppose, is the server reply because that has
> the body of the NNTP article. However, surely, that's not the way to use
> org.apache.commons.net.nntp.NNTPClient, only I can't find the correct way.
> Hence this kludge to grab the MalformedServerReply instead of parsing it.
>
> I suppose it's possible to log everything, and then parse the log file, but
> that seems like a very complex way of doing a simple thing.
>
> The API documentation for NNTPClient assumes a knowledge of NNTP which,
> unfortunately, I don't have. I've looked through the example code and don't
> see any samples where article bodies are parsed. The closest I see is
> NNTPClient.retrieveArticleBody:
>
> https://commons.apache.org/net/api-3.1/org/apache/commons/net/nntp/NNTPClient.html#retrieveArticleBody%28java.lang.String%29
>
> however, that's just malformed content. Presumably, since Pan can connect
> with gmane fine, that's not the problem. Also, by looking in the Pan
> newsreader, NNTPClient.retrieveArticleBody results match with what I'm after
> -- namely, the body of the article.
>
> What is the correct way to grab the article body? I've looked through the
> API quite thoroughly.
>
> Surely there must be an example for parsing the article body, not just the
> header. Or, at least, using BufferedReader to get the article body and
> assign it to a String. If so, I don't see a better method available through
> the API.
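
To make the RFC 977 point concrete, here is a rough sketch of what a client expects on the wire after a BODY command: one status line (222 means the body follows), then the text lines, then a line containing only ".". It uses plain java.io and canned strings instead of a live server, so it is not how commons-net implements this. The class name NntpBodySketch, the method readBody, and both sample replies are made up for illustration, and the error text only imitates the message in your log.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical illustration only; not part of commons-net.
class NntpBodySketch {

    /** Reads one BODY response: a status line, then text lines up to a lone ".". */
    static String readBody(BufferedReader in) throws IOException {
        String status = in.readLine();
        // Every RFC 977 reply starts with a three-digit code.
        if (status == null || !status.matches("\\d{3}( .*)?")) {
            throw new IOException("Could not parse response code. Server Reply: " + status);
        }
        // 222 is the "article retrieved - body follows" reply.
        if (!status.startsWith("222")) {
            throw new IOException("Unexpected reply: " + status);
        }
        StringBuilder body = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.equals(".")) {
                return body.toString(); // a lone "." ends the text
            }
            if (line.startsWith("..")) {
                line = line.substring(1); // collapse the doubled leading period
            }
            body.append(line).append('\n');
        }
        throw new IOException("Stream ended before the terminating \".\" line");
    }

    public static void main(String[] args) throws IOException {
        // Canned, well-formed reply (made-up article number and message-id).
        String wellFormed = "222 10 <example@invalid> article retrieved - body follows\r\n"
                + "First line\r\n"
                + "..leading dot\r\n"
                + ".\r\n";
        System.out.print(readBody(new BufferedReader(new StringReader(wellFormed))));

        // Reply with no status line, similar in shape to the one in the log.
        String noStatus = "<p>Alex ...</p>\r\n.\r\n";
        try {
            readBody(new BufferedReader(new StringReader(noStatus)));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the first line read at that point is article text rather than a three-digit reply, the parse fails in the same way as the second case here. That is consistent with the exception you logged, although I cannot tell from the log alone why the status line was absent.
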

Have a look at the examples in:

http://commons.apache.org/net/examples/nntp/

>
>
> thanks,
>
> Thufir
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
