Hello David!

> Your project seems to be very interesting, could you elaborate a bit more?
Thank you so much! I would be happy to elaborate on it over a Skype call: I could share my screen and show you what I'm cooking up :D

Back to your reply: yes, I was mainly testing at times when both Europe and the USA are online. However, I am seeing this delay from my laptop; maybe it will be faster once deployed, and the cause is my flaky home network?

I am concerned because I first need to fetch results from Wikipedia, then process them with my own data (that step is fast enough, <200ms), and then push the result to the client. That is why I will put it server-side and not client-side.

I need the search generator only as the *first entry point*: imagine you need to search for a topic, but you don't know exactly what. Imagine an input form: you type in some keywords, select one of the results, and then start your session.

I cannot estimate exactly how many FST queries I will need; let's say each user will need the search generator only once per session. Maybe 30 concurrent users per second would be a good reference (it's the same number that Facebook's Parse provides; Firebase goes up to 100... so maybe I could rely on a similar order of magnitude...)

If I can give people a smooth search experience, that would be interesting because I could free up resources: I may extend a knowledge discovery test to other languages, too. If the first user experience were too slow (~1.3s + bandwidth transmission, ~1.5+ per query), that could become critical.

I don't need the search generator to operate in batch, or to track changes. It just helps the user find a topic as an entry point for discovery. I cannot use 'Opensearch' because it does not provide _IDs; also, it searches against titles only.

Would it be possible to somehow reserve bandwidth or requests for a domain?
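To make the flow above concrete, here is a minimal sketch of the server-side step: build one search-generator request and pull page IDs and titles out of the JSON reply. The function names `build_search_url` and `extract_candidates` are made up for illustration, and I am assuming the English Wikipedia endpoint with `formatversion=2` (so that `pages` comes back as a list) and that each hit carries an `index` field for ranking.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia; another language edition would use its own host.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_url(keywords, limit=10):
    """Build a MediaWiki API URL that runs a full-text search as a generator."""
    params = {
        "action": "query",
        "generator": "search",
        "gsrsearch": keywords,   # the keywords typed into the input form
        "gsrlimit": limit,       # how many candidate topics to return
        "format": "json",
        "formatversion": 2,      # assumption: makes "pages" a list, not a dict
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_candidates(response):
    """Return (pageid, title) pairs in search-rank order from a decoded reply."""
    pages = response.get("query", {}).get("pages", [])
    # Assumption: each hit has an "index" giving its rank in the search results.
    ranked = sorted(pages, key=lambda page: page.get("index", 0))
    return [(page["pageid"], page["title"]) for page in ranked]
```

The URL could then be fetched with any HTTP client (for example `urllib.request.urlopen`), decoded with `json.loads`, and passed to `extract_candidates` before merging with my own data. Timing that single call on the server would show how much of the delay comes from my home network.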
On Wed, Dec 23, 2015 at 3:55 PM, David Causse <[email protected]> wrote:

> On 22/12/2015 18:28, Luigi Assom wrote:
>
>> I tested it from my laptop, and I found it quite slow; as example, it
>> took:
>>
>> ~1.2 seconds for querying 'DNA'
>>
>> ~1.6 s for 'terroristi attacks'
>>
>> ~1.7s for 'biology technology'
>>
>>
> For a single word query on english wikipedia this is more like 400ms for
> me, so I'm not sure to understand why you experienced such response times.
> Response times may vary depending on server load but I'm surprised you
> noticed more than 1 sec for simple queries like that.
> Did you check that you are receiving the result type/format you expect
> (i.e. format=json ) ?
> Could you re-check at different times of the day, servers may be busy
> around 8pm CET (time when both europe and america are active).
>
> Your project seems to be very interesting, could you elaborate a bit more?
> Do you plan to use the api from a backend/automata which will need to send
> a lot of queries, do you have an estimation on your needs (number of
> queries and refresh rate)?
> If your process is like refreshing a set of queries regularly I'd suggest
> you build a daemon that send few queries (3 or 4) per minute rather than an
> aggressive batch with parallel processes run once a day/week/month.
> You should have a look at RCStream[1] which may be more appropriate to
> your needs (if you plan to track changes it's definitely better than
> refreshing the same set of queries regularly)
>
> Thank you!
>
> [1] https://wikitech.wikimedia.org/wiki/RCStream
>
> _______________________________________________
> discovery mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/discovery
>
