[Biojava-l] Fwd: Java Error:- XML Parsing Error: XML or text declaration not at start of entity

Richard Holland Tue, 24 Nov 2009 07:16:03 -0800

Jitesh - I forwarded your response to the list so that everyone can get the 
chance to reply.


cheers,
Richard

Begin forwarded message:

> From: jitesh dundas <[email protected]>
> Date: 24 November 2009 14:47:00 GMT
> To: Richard Holland <[email protected]>
> Subject: Re: [Biojava-l] Java Error:- XML Parsing Error: XML or text 
> declaration not at start of entity
> 
> Dear Sir,
>  
> Thank you for your reply. I figured this problem out by fetching the records 
> in small sets, e.g. 20 records per page.
>  
> It works like pagination: for each new page, we need to hit the URL again.
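The paging described here can be sketched with the ESearch `retstart` and `retmax` parameters, which select the offset and size of each result page. This is only a minimal sketch: the class name `ESearchPages`, the helper names, and the page size of 500 are illustrative assumptions, not code from the thread. The total of 2034 is the `<Count>` value from the ESearch output quoted further down.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BioJava or of the code posted in this thread.
class ESearchPages {
    // Base URL as used in the thread.
    static final String BASE = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

    // URL-encodes a query parameter value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Builds one ESearch URL per page: retstart is the offset, retmax the page size.
    static List<String> pageUrls(String db, String term, int totalCount, int pageSize) {
        List<String> urls = new ArrayList<String>();
        for (int start = 0; start < totalCount; start += pageSize) {
            urls.add(BASE + "?db=" + enc(db)
                    + "&term=" + enc(term)
                    + "&retstart=" + start
                    + "&retmax=" + pageSize);
        }
        return urls;
    }

    public static void main(String[] args) {
        // 2034 is the <Count> reported in the quoted ESearch result; 500 is an arbitrary page size.
        for (String url : pageUrls("pubmed", "cancer", 2034, 500)) {
            System.out.println(url);
        }
    }
}
```

Reading `<Count>` from the first response and then looping over the offsets keeps each request small, which matches the "small sets" approach described above.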
>  
> My functionality is working fine. I will be happy to share my code with 
> anyone who needs it.
>  
> I simply fetch data from the URL and write to an XML file. Next I just read 
> the XML file and show them in the web page to the user.
>  
> Next, I need to know how to fetch records from the protein database. I 
> suspect two types of searches are needed.
>  
> First we use the ESearch utility and then the EFetch utility to get the data 
> for the specific protein.
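A minimal sketch of that two-step flow, building the two request URLs only. The class and method names are hypothetical, and requesting FASTA text (`rettype=fasta&retmode=text`) is an assumption about the desired output format; the accession `BAA20519` is the one from the commented-out URL in the original code, and the IDs in `main` are placeholders.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper showing the ESearch -> EFetch sequence for db=protein.
class ProteinFetchUrls {
    static final String EUTILS = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/";

    // URL-encodes a query parameter value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Step 1: ESearch returns the list of record IDs matching the term.
    static String esearchUrl(String term) {
        return EUTILS + "esearch.fcgi?db=protein&term=" + enc(term);
    }

    // Step 2: EFetch returns the records for those IDs (comma-separated).
    // Assumption: FASTA text is the wanted format.
    static String efetchUrl(List<String> ids) {
        StringBuilder joined = new StringBuilder();
        for (String id : ids) {
            if (joined.length() > 0) {
                joined.append(',');
            }
            joined.append(id);
        }
        return EUTILS + "efetch.fcgi?db=protein&id=" + joined
                + "&rettype=fasta&retmode=text";
    }

    public static void main(String[] args) {
        System.out.println(esearchUrl("BAA20519"));
        // Placeholder IDs; in practice these come from the <Id> elements of the ESearch result.
        System.out.println(efetchUrl(Arrays.asList("111", "222")));
    }
}
```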
>  
> I welcome any suggestions on this!
>  
> Thank you everyone for your help.
> 
> Regards,
> Jitesh Dundas
>  
> On 11/24/09, Richard Holland <[email protected]> wrote:
> Your program takes an input 'txtURLString' - could you give an example of the 
> value that this usually contains? I suspect that this URL is where your 
> problem lies but without seeing an example value I couldn't say for sure.
> 
> thanks,
> Richard
> 
> On 8 Nov 2009, at 10:22, jitesh dundas wrote:
> 
> > Dear Sir,
> >
> > My program is working fine and can send me an XML file with 20
> > records. However, it does not let me retrieve larger numbers of
> > records.
> >
> > For example, if I enter "cancer" it will return only 20 records.
> >
> > Can you please tell me what I should do next to get all those records?
> > Thank you in advance.
> >
> > Regards,
> > Jitesh Dundas
> >
> > On Sun, Nov 1, 2009 at 9:36 PM, Andreas Prlic <[email protected]> wrote:
> >>
> >> Hi Jitesh,
> >>
> >> It is hard to read your code with all the formatting off, probably due to 
> >> email, and with many commented-out lines that don't seem to get used. Can 
> >> you provide the stack trace, so we can see what part of BioJava is affected?
> >>
> >> A good strategy for writing and debugging this is probably to simplify the 
> >> problem into smaller steps. First try to download the files you want to 
> >> parse and write the code to parse them from the local file. That will avoid 
> >> any issues you might encounter with networking and server/client 
> >> communication. Once the parsing is working, you can take it to the next 
> >> step and add the server communication.
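The "parse a local file first" step could look roughly like the sketch below, which uses only the JDK's built-in DOM parser. The class name `ESearchResultParser` and its methods are hypothetical. Because the ESearch result declares an external DTD, the sketch answers DTD lookups with an empty document so that parsing needs no network access; that is one possible choice, not the only one.

```java
import java.io.File;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: extracts the <Id> values from a saved ESearch result.
class ESearchResultParser {

    static List<String> parseIds(InputSource source) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve the external DTD to an empty document so no download is attempted.
            builder.setEntityResolver(new EntityResolver() {
                public InputSource resolveEntity(String publicId, String systemId) {
                    return new InputSource(new StringReader(""));
                }
            });
            Document doc = builder.parse(source);
            NodeList nodes = doc.getElementsByTagName("Id");
            List<String> ids = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(nodes.item(i).getTextContent().trim());
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse ESearch result", e);
        }
    }

    // Parses a result previously saved to disk, e.g. downloaded with a browser.
    static List<String> parseIdsFromFile(File savedResult) {
        return parseIds(new InputSource(savedResult.toURI().toString()));
    }

    // Parses a result held in memory; handy for quick experiments.
    static List<String> parseIdsFromString(String xml) {
        return parseIds(new InputSource(new StringReader(xml)));
    }
}
```

Once this works against a saved file, the same `parseIds` method can be fed the stream of a live connection without further changes.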
> >>
> >> Andreas
> >>
> >>
> >>
> >>
> >> On Sun, Nov 1, 2009 at 7:41 AM, jitesh dundas <[email protected]> wrote:
> >>>
> >>> Hi friends,
> >>>
> >>> I am getting this error when doing a POST (using the code below) to this 
> >>> URL:
> >>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=cancer&reldate=10
> >>>
> >>> I have written this code in a .jsp file. Later I will change it into a 
> >>> servlet.
> >>>
> >>> Error:-
> >>> XML Parsing Error: XML or text declaration not at start of entity
> >>> Location:
> >>> http://localhost:8080/ProteomDb/ImportFromPubmed2.jsp?txtDbName=pubmed&txtTerm=cancer&txtreldate=10&comSDay=01&comSMonth=01&txtSYear=&comEDay=01&comEMonth=01&txtEYear=&txtURLString=http%3A%2F%2Feutils.ncbi.nlm.nih.gov%2Fentrez%2Feutils%2Fesearch.fcgi%3Fdb%3Dpubmed%26term%3Dcancer%26reldate%3D10&txtsubmit=Fetch+Data+From+NCBI
> >>> Line Number 11, Column 1:<?xml version="1.0" ?><!DOCTYPE eSearchResult
> >>> PUBLIC "-//NLM//DTD eSearchResult, 11 May 2002//EN"
> >>> "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSearch_020511.dtd"><eSearchResult><Count>2034</Count><RetMax>20</RetMax><RetStart>0</RetStart><IdList>
> >>>   <Id>19877350</Id>        <Id>19877304</Id>        <Id>19877297</Id>
> >>>   <Id>19877284</Id>        <Id>19877271</Id>        <Id>19877265</Id>
> >>>   <Id>19877250</Id>        <Id>19877245</Id>        <Id>19877226</Id>
> >>>   <Id>19877210</Id>        <Id>19877179</Id>        <Id>19877175</Id>
> >>>   <Id>19877161</Id>        <Id>19877159</Id>        <Id>19877158</Id>
> >>>   <Id>19877123</Id>        <Id>19877122</Id>        <Id>19877120</Id>
> >>>   <Id>19877119</Id>        <Id>19877118</Id>
> >>> </IdList><TranslationSet><Translation>     <From>cancer</From>
> >>> <To>"neoplasms"[MeSH Terms] OR "neoplasms"[All Fields] OR "cancer"[All
> >>> Fields]</To>    </Translation></TranslationSet><TranslationStack>
> >>> <TermSet>    <Term>"neoplasms"[MeSH Terms]</Term>    <Field>MeSH
> >>> Terms</Field>    <Count>2082133</Count>    <Explode>Y</Explode>
> >>> </TermSet>   <TermSet>    <Term>"neoplasms"[All Fields]</Term>    
> >>> <Field>All
> >>> Fields</Field>    <Count>1634731</Count>    <Explode>Y</Explode>
> >>> </TermSet>   <OP>OR</OP>   <TermSet>    <Term>"cancer"[All Fields]</Term>
> >>> <Field>All Fields</Field>    <Count>902537</Count>    <Explode>Y</Explode>
> >>> </TermSet>   <OP>OR</OP>   <OP>GROUP</OP>   <TermSet>
> >>> <Term>2009/10/22[EDAT]</Term>    <Field>EDAT</Field>    <Count>0</Count>
> >>> <Explode>Y</Explode>   </TermSet>   <TermSet>
> >>> <Term>2009/11/01[EDAT]</Term>    <Field>EDAT</Field>    <Count>0</Count>
> >>> <Explode>Y</Explode>   </TermSet>   <OP>RANGE</OP>   <OP>AND</OP>
> >>> </TranslationStack><QueryTranslation>("neoplasms"[MeSH Terms] OR
> >>> "neoplasms"[All Fields] OR "cancer"[All Fields]) AND 2009/10/22[EDAT] :
> >>> 2009/11/01[EDAT]</QueryTranslation></eSearchResult>
> >>> ^
> >>>
> >>> As you can see, the XML output is coming through fine, but the above error 
> >>> does not go away. The output of this program should be just like hitting 
> >>> the above URL manually in the browser.
> >>> The browser is Mozilla Firefox.
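The "Line Number 11, Column 1" in the error is consistent with the XML declaration arriving after ten lines of other output. In a JSP this is commonly the newlines left behind by the page directives and blank lines that precede the scriptlet, although the thread does not confirm that this was the cause. The sketch below (class name `XmlDeclPosition` is hypothetical) only demonstrates the underlying rule: an XML declaration must be the very first thing in the document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo: whitespace before "<?xml" makes a document ill-formed.
class XmlDeclPosition {

    // Returns true if the JDK's XML parser accepts the text as well-formed.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String body = "<?xml version=\"1.0\" ?>"
                + "<eSearchResult><Count>2034</Count></eSearchResult>";
        String withBlankLines = "\n\n\n" + body;

        System.out.println(parses(body));                  // true
        System.out.println(parses(withBlankLines));        // false
        System.out.println(parses(withBlankLines.trim())); // true
    }
}
```

If the directive newlines are indeed the culprit, two remedies that may be worth trying are the JSP 2.1 page attribute `trimDirectiveWhitespaces="true"`, or writing the directives and the opening `<%` back to back with no line breaks between them.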
> >>>
> >>> Code:-
> >>>
> >>> <%@ page language = "java" %>
> >>> <%@ page import = "java.sql.*" %>
> >>> <%@ page import = "java.util.*" %>
> >>> <%@ page import = "java.io.*" %>
> >>> <%@ page import="java.lang.*" %>
> >>> <%@ page import="java.net.*" %>
> >>> <%@ page import="java.nio.*" %>
> >>> <%@ page contentType="text/xml; charset=utf-8" pageEncoding="UTF-8" %>
> >>>
> >>>
> >>> <%
> >>>
> >>> try
> >>> {
> >>>    //String str = "<?xml version='1.0' ?>";
> >>>    //out.println("<?xml version='1.0' encoding='utf-8' ?>");
> >>>
> >>>    Properties systemSettings = System.getProperties();
> >>>    systemSettings.put("http.proxyHost", "********");
> >>>    systemSettings.put("http.proxyPort", "******");
> >>>    systemSettings.put("sun.net.client.defaultConnectTimeout", "10000");
> >>>    systemSettings.put("sun.net.client.defaultReadTimeout", "10000");
> >>>
> >>>     //out.println("Properties Set");
> >>>    Authenticator.setDefault(new Authenticator()
> >>>    {
> >>>          protected PasswordAuthentication getPasswordAuthentication()
> >>>          {
> >>>                  // specify your user name and password for the IITB login
> >>>                  return new PasswordAuthentication("**", "******".toCharArray());
> >>>          }
> >>>    });
> >>>
> >>>
> >>>   System.setProperties(systemSettings);
> >>>   //out.println("After Authentication & Properties Settings");
> >>>
> >>>   //create xml file.
> >>>   //the input to google api
> >>>   //String textAreaContent = request.getParameter("text");
> >>>   String textAreaContent = "This si a tst";
> >>>
> >>>   String str = "<?xml version='1.0' encoding='utf-8' ?>";
> >>>
> >>>   //xml file generation ends here..
> >>>   //FetchDataFromNCBI_URLString.jsp
> >>>   String URLString = request.getParameter("txtURLString").trim();
> >>>
> >>>   //URL url = new URL("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=protein&term=BAA20519");
> >>>   URL url = new URL(URLString); //url string taken from user input.
> >>>   HttpURLConnection connection = null;
> >>>
> >>>   connection = (HttpURLConnection) url.openConnection();
> >>>   System.out.println("After open connection");
> >>>   connection.setRequestMethod("POST");
> >>>   connection.setDoInput(true);
> >>>   connection.setDoOutput(true);
> >>>
> >>>   connection.setUseCaches(false);
> >>>   connection.setAllowUserInteraction(false);
> >>>   //connection.setFollowRedirects(true);
> >>>   //connection.setInstanceFollowRedirects(true);
> >>>   //System.out.println("Before-------------------");
> >>>   connection.setRequestProperty("Content-Type", "text/xml; charset=\"utf-8\"");
> >>>   //System.out.println("After-------------------");
> >>>
> >>>   //System.out.println(""+ connection.getOutputStream());
> >>>
> >>>   //System.out.println("After dataoutputstream..Line No-65");
> >>>
> >>>   //System.out.println("Response Code="+ connection.getResponseCode);
> >>>
> >>>   OutputStreamWriter dosout = new OutputStreamWriter(connection.getOutputStream());
> >>>   //System.out.println("After dosout object..Line No-63");
> >>>   //dosout.write(str);
> >>>   dosout.close ();
> >>>
> >>>   BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
> >>>
> >>>   String decodedString;
> >>>   String tempstr = "";
> >>>
> >>>
> >>>   while ((decodedString = in.readLine()) != null)
> >>>   {
> >>>       tempstr = tempstr + decodedString;
> >>>       //out.println(decodedString);
> >>>   }
> >>>   out.println(tempstr);
> >>>   in.close();
> >>> }
> >>> catch(Exception ex)
> >>> {
> >>> out.println("Exception->"+ex);
> >>> PrintWriter pw = response.getWriter();
> >>> ex.printStackTrace(pw);
> >>> }
> >>>
> >>>
> >>> %>
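The read loop in the listing above joins the response lines without separators and decodes them with the platform default charset. A minimal sketch of an alternative is shown below; the class name `ResponseReader` is hypothetical, and treating the response as UTF-8 is an assumption rather than something stated in the thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads a whole response body, keeping line breaks.
class ResponseReader {

    static String readAll(InputStream in) {
        StringBuilder body = new StringBuilder();
        // Assumption: the server sends UTF-8. try-with-resources closes the stream.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // StringBuilder avoids repeated String copies; '\n' restores the line break.
                body.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body.toString();
    }
}
```

In the JSP this would be called as `ResponseReader.readAll(connection.getInputStream())`. Keeping the line breaks also makes the line numbers in parser error messages easier to relate to the original response.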
> >>>
> >>> Thanks in advance..
> >>>
> >>> Regards,
> >>> JItesh Dundas
> >>>
> >>> _______________________________________________
> >>> Biojava-l mailing list  -  [email protected]
> >>> http://lists.open-bio.org/mailman/listinfo/biojava-l
> >>
> >>
> > <ImportFromPubmed3.jsp>
> 
> --
> Richard Holland, BSc MBCS
> Operations and Delivery Director, Eagle Genomics Ltd
> T: +44 (0)1223 654481 ext 3 | E: [email protected]
> http://www.eaglegenomics.com/
> 
> 

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: [email protected]
http://www.eaglegenomics.com/


_______________________________________________
Biojava-l mailing list  -  [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
