Interesting points about the listen setting... but great to know it's now working for you.
As for the size of index data... it's tricky. Attributes are relatively easy - integers, booleans and timestamps take a fixed number of bytes each, but string attributes vary in size, and then fields get even more complicated again, depending on infix/prefix settings and morphology settings. There are a few tips here that may help clarify things:

http://support.flying-sphinx.com/kb/managing/reducing-index-data-size

--
Pat

On 15 Apr 2014, at 1:12 am, Joram Okwaro <[email protected]> wrote:

> Hi Pat,
>
> Got it working. The only thing I did differently was add the listen setting
> to the thinking_sphinx.yml file of the dummy app in the sphinx/search server.
> I had trouble connecting to my search server whereby I kept on getting a
> ThinkingSphinx::ConnectionError with the message Can't connect to MySQL
> server on '22.22.22.222' (111). This took me a while to figure out but I
> finally did. Though according to my rules on Amazon EC2 I should have been
> able to connect to port 9306, I still couldn't. At first, the listen setting
> wasn't there and by default it resulted in creating the listen setting in the
> resultant production.sphinx.conf (my Sphinx configuration file) as
> 0.0.0.0:9306:mysql41. After scouring the interwebs for other people's conf
> files I realized that in order for searchd to run on port 9306, the conf file
> had to have 9306:mysql41 instead
> (http://stackoverflow.com/questions/11374558/sphinx-search-mysql-client-on-production-server).
> I hardcoded this in my thinking_sphinx.yml file as shown above and it did
> the trick.
>
> Thanks for the constant help sir. I wouldn't have set this up without your
> help. One last question.. is there any way for me to estimate the size of my
> indices before I generate them? I have a massive table that is throwing an
> error at some point because of what I'm assuming is low disk space:
>
> ** [out :: 54.247.96.253] ERROR: index 'external_financial_event_core':
> raw_hits: write error: 1001335 of 1048451 bytes written.
>
> Cheers,
> Joram.
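The listen change described in the message above could look roughly like the sketch below in thinking_sphinx.yml on the Sphinx server. Only the `listen` value comes from the message itself; its placement under `production` and the two path settings are assumptions borrowed from the configuration quoted later in this thread, not a confirmed copy of Joram's file.

```yaml
# Hypothetical sketch of thinking_sphinx.yml on the Sphinx server.
production:
  indices_location: "/home/ubuntu/projects/shared/db/sphinx"
  configuration_file: "/home/ubuntu/projects/shared/production.sphinx.conf"
  # Assumption: this value is passed through to the searchd section as-is,
  # replacing the generated default of 0.0.0.0:9306:mysql41 mentioned above.
  listen: "9306:mysql41"
```

With only a port given, searchd should listen on all interfaces, so access to port 9306 still depends on the firewall or EC2 security group rules.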
>
> On Thursday, April 3, 2014 12:51:43 AM UTC+3, Pat Allan wrote:
> Hi Joram
>
> Appreciate all the detail - it looks like you're on track, I think. The
> 'address' setting should indeed be the IP address of the Sphinx server (and
> you'll want that in your production settings of thinking_sphinx.yml on both
> machines).
>
> As for the port - via the mysql41 setting - this is for Sphinx. MySQL itself
> is not used at all (as your database is PostgreSQL), it's just that Sphinx
> communicates as if it were a MySQL server (it uses the MySQL server
> protocol), hence the name of this setting, and it's also why the mysql2 gem
> is required. You don't need a MySQL database or a MySQL server (so, you
> should probably close port 3306).
>
> Cheers
>
> --
> Pat
>
> On 3 Apr 2014, at 6:29 am, Joram Walekhwa Okwaro <[email protected]> wrote:
>
>> Hi Pat,
>>
>> There seems to be a disconnect somewhere in the way I have understood how
>> the setup works. I have nailed down indexing; searching is the issue now.
>> Apologies for any repetition on my side too but I think this will help you
>> guide me better. I will explain what I've done so far and ask questions
>> along the way.
>>
>> Sphinx/Indexing server
>>
>> I've set up the 'dumb' app on this server.
>> The thinking_sphinx.yml file looks like this:
>>
>> development:
>>   pid_file: "/var/run/sphinx/searchd.pid"
>>   indices_location: "/home/shared/db/sphinx"
>>   configuration_file: "/home/shared/development.sphinx.conf"
>>   sql_sock: /var/run/mysqld/mysqld.sock
>>   searchd_log: "/home/log/production.searchd.log"
>>   query_log: "/home/log/production.query.log"
>>   #mem_limit: 128M
>>   morphology: stem_en
>>   min_infix_len: 3
>>   enable_star: true
>> production:
>>   #pid_file: "/var/run/sphinx/searchd.pid"
>>   indices_location: "/home/ubuntu/projects/shared/db/sphinx"
>>   configuration_file: "/home/ubuntu/projects/shared/production.sphinx.conf"
>>   #sql_sock: /var/run/mysqld/mysqld.sock
>>   searchd_log: "/home/ubuntu/projects/kopo-kopo/log/production.searchd.log"
>>   query_log: "/home/ubuntu/projects/kopo-kopo/log/production.query.log"
>>   #mem_limit: 128M
>>   morphology: stem_en
>>   min_infix_len: 3
>>   enable_star: true
>>
>> The database.yml file is set up to connect to the production rails server
>> and hence index the production db:
>>
>> production:
>>   adapter: postgresql
>>   encoding: unicode
>>   database: app_production
>>   pool: 5
>>   username: joram
>>   password: password
>>   host: 11.11.11.111
>>   port: 5432
>>
>> I've installed Sphinx and MySQL on this server and run the rebuild rake task
>> which has created my conf files successfully. I've also opened up port 5432
>> (Postgres port) on my production server to accept remote connections from my
>> Sphinx server and this works well, i.e. indexing is working like a charm.
>>
>> Production server
>>
>> On my production Rails server, I have put my Sphinx server's address in the
>> thinking_sphinx.yml file as shown below:
>>
>> development:
>>   morphology: stem_en
>>   min_infix_len: 3
>>   enable_star: true
>> production:
>>   address: 22.22.22.222
>>
>> I'm assuming this means that Thinking Sphinx will connect to the default
>> port 9306 on my Sphinx server (22.22.22.222). My confusion comes in here.
>> Is the mysql41 port on my production server's configs (9306 by default in
>> this case) supposed to be the MySQL port on my Sphinx server or the searchd
>> port? I've done a netstat on my Sphinx server and MySQL is running on port
>> 3306 (default I think) and searchd obviously on 9306.
>>
>> I had earlier put mysql41 in the production thinking_sphinx.yml file above
>> as the mysql port on my sphinx server and that is why I was getting a
>> connection error since I was trying to connect to MySQL with no password.
>>
>> So that's the BIG question of the day Pat. What am I missing in order to
>> connect my production server to the Sphinx server so that it can use those
>> generated indices to search? You're doing a great job being patient with us
>> noobs trying to figure things out. I hope I don't cause your patience to run
>> out :)
>>
>> Thanks,
>> Joram.
>>
>> On Wed, Apr 2, 2014 at 2:35 AM, Pat Allan <[email protected]> wrote:
>> Hi Joram
>>
>> Apologies if this is covering stuff you already knew:
>>
>> * In config/thinking_sphinx.yml for your production environment you'll want
>> to set mysql41 to the port you prefer (or don't set it, and Sphinx will run
>> on port 9306), and you'll need to set address to the server Sphinx is
>> running on.
>>
>> * In config/database.yml for your production environment you'll want to
>> include your database connection settings.
>>
>> Both of these files should be present on each server your app code is on -
>> they're required on your app server so it knows how to talk to both the
>> database and Sphinx, and they're required on your Sphinx server so Sphinx
>> can bind itself to the appropriate address, and it can talk to the database
>> using the correct credentials when indexing.
>>
>> Hope this helps - do let me know if you've got further questions.
>>
>> Cheers
>>
>> --
>> Pat
>>
>> On 2 Apr 2014, at 1:37 am, Joram Walekhwa Okwaro <[email protected]> wrote:
>>
>>> Hi Pat,
>>>
>>> Thanks for the reply.
>>> You mentioned this: "...and also set the address of
>>> the Sphinx server (so it binds to that address instead of 127.0.0.1, which
>>> is the default).". You mean the app server right? The app server is
>>> connecting to a remote instance of searchd through MySQL so the address of
>>> the app server is set in MySQL configs right?
>>>
>>> Secondly, I have opened up MySQL to accept remote connections from my app
>>> server, but even before I tried it, I was wondering how Thinking Sphinx on
>>> my app server will be able to connect to MySQL without the details of the
>>> user I created and granted db privileges, let alone the password. Is there
>>> an option to declare this in thinking_sphinx.yml? Unless I've got this part
>>> all wrong. This indeed ended up being a problem since I still don't have
>>> access. I get ThinkingSphinx::SphinxError with the message "Access denied
>>> for user 'ubuntu'@'ip-77-77-77-77.eu-west-1.compute.internal' (using
>>> password: NO)"
>>>
>>> Would appreciate some help. Thanks.
>>>
>>> Joram.
>>>
>>> On Tue, Apr 1, 2014 at 2:08 AM, Pat Allan <[email protected]> wrote:
>>> Hi Joram
>>>
>>> Yes, the mysql41 port is how Sphinx can be connected to. You'll need to
>>> make sure that's set and opened up to the world, and also set the address
>>> of the Sphinx server (so it binds to that address instead of 127.0.0.1,
>>> which is the default).
>>>
>>> There was a bug with the address setting discovered recently - it's in the
>>> Riddle gem, but you can get the latest by using the following in your
>>> Gemfile:
>>>
>>> gem 'riddle', '~> 1.5.10',
>>>   :git => 'git://github.com/pat/riddle.git',
>>>   :branch => 'develop',
>>>   :ref => '0dfe38063c'
>>>
>>> Cheers
>>>
>>> --
>>> Pat
>>>
>>> On 1 Apr 2014, at 12:47 am, Joram Okwaro <[email protected]> wrote:
>>>
>>> > Hi Pat,
>>> >
>>> > So we still opted for a remote sphinx server. I was able to set up the
>>> > server and connect to the production database remotely.
>>> > I can therefore index the production db and thus generate the indices
>>> > in the sphinx server. Thanks for the help once again. I am now faced
>>> > with another 'big picture' issue.
>>> >
>>> > 1. So now that my app server will be sending search queries to searchd on
>>> > the sphinx server, I'm guessing I need to open up the port on which
>>> > Sphinx runs on the sphinx server? There's a mysql41 setting that I have
>>> > set to the mysql port on my sphinx server. This I'm assuming is the port
>>> > that I need to open to get to searchd? Is this all that's needed in this
>>> > server as far as configuration is concerned?
>>> >
>>> > Maybe my big picture looking at it from the app server is all wrong :)
>>> >
>>> > On Tuesday, March 18, 2014 3:31:01 PM UTC+3, Pat Allan wrote:
>>> > It really depends on how many records (and how much data per record)
>>> > you're indexing... Sphinx is generally pretty well-behaved, but I guess
>>> > it depends on how limited the resources are on your app server. Whenever
>>> > indexing happens, it will mean there's plenty of traffic between the
>>> > indexer and your database, so having them share a machine is not a bad
>>> > idea (instead of adding extra external network traffic).
>>> >
>>> > On 18 Mar 2014, at 5:57 pm, Joram Okwaro <[email protected]> wrote:
>>> >
>>> >> Hi Pat,
>>> >>
>>> >> Thanks for the quick responses on both threads. I took on this task from
>>> >> a colleague of mine so I don't know yet how much research he did on
>>> >> Sphinx performance. That was the main reason why we opted for a remote
>>> >> server. The idea was that Sphinx was too resource-heavy and therefore a
>>> >> risk for our app server which we can't afford to be slow. I would
>>> >> appreciate your 2 cents on this. Otherwise, thank you once again for
>>> >> your help. You have helped a great deal. I'll let you know if I
>>> >> encounter any specific issues.
>>> >>
>>> >> Thanks!
>>> >>
>>> >> On Tuesday, March 18, 2014 9:18:15 AM UTC+3, Pat Allan wrote:
>>> >> Didn't quite cover this in the other thread.
>>> >>
>>> >> On 18 Mar 2014, at 5:06 pm, Joram Okwaro <[email protected]> wrote:
>>> >>
>>> >> > Hi Guys,
>>> >> >
>>> >> > I've already replied to a thread that I hope Pat can reply to about
>>> >> > this but I thought just in case he's too busy, someone here can help
>>> >> > me out in the meantime. I'm having trouble finding concise
>>> >> > documentation on how to set up a remote Sphinx server. My main
>>> >> > questions are related to Sphinx 3.1.0 and maybe the answer might be
>>> >> > that I need to go back to version 2 to set this up painlessly.
>>> >> >
>>> >> > 1. From my understanding, I need to set up Sphinx on my remote server
>>> >> > and also a copy of my Rails application (with Thinking Sphinx of
>>> >> > course) in order to index my models. Is this still the case?
>>> >>
>>> >> Yup.
>>> >>
>>> >> > 2. If point #1 is the case, how does the indexer index my database
>>> >> > which lives on the app server. Unless I have to set up a database on
>>> >> > the search server too which doesn't make sense. I'm pretty lost as you
>>> >> > can see :) So please help. How would this generally work? That's the
>>> >> > big question.
>>> >>
>>> >> You'll need to have your database accessible remotely - and have the
>>> >> appropriate details in config/database.yml.
>>> >>
>>> >> If you're going to the effort of having Sphinx on its own server, do you
>>> >> have the database on its own server too? Perhaps it's worth discussing
>>> >> why you want to have Sphinx on its own server?
>>> >>
>>> >> Cheers
>>> >>
>>> >> --
>>> >> Pat
>>> >>
>>> >> --
>>> >> You received this message because you are subscribed to the Google
>>> >> Groups "Thinking Sphinx" group.
>>> >> To unsubscribe from this group and stop receiving emails from it, send
>>> >> an email to [email protected].
>>> >> To post to this group, send email to [email protected].
>>> >> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>> >> For more options, visit https://groups.google.com/d/optout.
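
To round off the thread: Pat's advice from 2 April (set `address` to the server Sphinx runs on and, optionally, `mysql41` to the port) might be sketched as the production block below. The IP is the placeholder used in the thread and 9306 is the default Pat mentions, so treat the fragment as illustrative rather than a tested configuration.

```yaml
# Illustrative config/thinking_sphinx.yml production settings, per Pat's advice.
production:
  address: 22.22.22.222   # placeholder IP of the Sphinx server used in the thread
  mysql41: 9306           # optional; Pat notes 9306 is used when this is unset
```

Pat suggests keeping these settings in the file on both machines, so the app server knows where to send queries and searchd knows which address to bind to.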
